0

Incorrect handling of strings with different cases in in-memory

When loading data into IMDB, there is an issue with handling strings with different cases. If you create a table with identical strings but with different cases (e.g., 'Test' and 'test'), and then load it into the model, in Discovery, strings with lowercase letters, such as 'test' and 'test1', do not appear.

10 replies

null
    • Daniel_Chitlaru
    • 3 mths ago
    • Reported - view

    String comparison in IMDB is case-insensitive by default. ETL and previews that iterate over existing values will return all values as they were inserted, but discovery queries with GROUP BY etc will cause it to merge values that differ only by case.

    Currently the only way to change it is by editing irdb.ini (located in Pyramid installation directory/imdb)

    Add a line: collation=X where X can be one of the following:

    OrdinalIgnoreCase   (Default)

    Ordinal (This is probably what you want)

    InvariantCulture

    InvariantCultureIgnoreCase

    CurrentCulture

    CurrentCultureIgnoreCase

      • TKH_TKH
      • 3 mths ago
      • Reported - view

       I used the suggested config, but it didn't help. Perhaps there is another solution?

      • Senior Director of Product
      • Ian_Macdonald
      • 3 mths ago
      • Reported - view

       

      Did you restart the IMDB Server after changing the .ini file?

      Ian

      • TKH_TKH
      • 3 mths ago
      • Reported - view

       Yeah, I've been reloading IMDB after changing the .ini file.

      • TKH_TKH
      • 3 mths ago
      • Reported - view

       screenshot:

      • Senior Director of Product
      • Ian_Macdonald
      • 3 mths ago
      • Reported - view

       reloading or restarting?

      • TKH_TKH
      • 3 mths ago
      • Reported - view

       Yes, I restarted the IMDB server after changing the .ini file

    • Senior Director of Product
    • Ian_Macdonald
    • 3 mths ago
    • Reported - view

    Hi 

    Also, remember that the Pyramid semantic model is a multidimensional model and is using the Strings column to create a DIMENSION (or Attribute if you prefer) in its dimensional model, which will be a list of unique values from that column. In the vast majority of cases, it is desired that case is ignored as data is often a little "dirty". Thus you would not want a dimension of customer names to differentiate between column values of Joe Smith and joe smith.

    When you place the customer name dimension on a grid in Discover, it lists the unique values of the underlying column as dimension MEMBERS, not the individual values from each record.

    Hope that helps.

    Ian

      • TKH_TKH
      • 3 wk ago
      • Reported - view

       Thank you for your response. In our specific case, we actually need to differentiate between strings that differ only by case (e.g., "Test" and "test") because they represent distinct entities in our data. The current behavior, where these strings are treated as the same value in the IMDB, is causing issues with data accuracy and aggregation in our reports. Is there a way to configure Pyramid to handle case sensitivity for dimensions or attributes in this scenario?

      • Senior Director of Product
      • Ian_Macdonald
      • 3 wk ago
      • Reported - view

       It is the underlying database that defines how Pyramid's queries are interpreted. For example, when populating a slicer, you can configure if the Pyramid query uses Group By or Distinct to generate the unique list of values that populate a dimension with members. How the DB responds, differentiating between upper and lower case or not, is configured in the database as we discussed with IMDB above. There isn't a setting in Pyramid.

      If IMDB is not respecting settings in the config or .ini file, then please raise a ticket.

      Hope that helps.

      Ian

Content aside

  • Status Answered
  • 3 wk agoLast active
  • 10Replies
  • 109Views
  • 3 Following