0

Incorrect handling of strings with different cases in in-memory

When loading data into IMDB, there is an issue with handling strings with different cases. If you create a table with identical strings but with different cases (e.g., 'Test' and 'test'), and then load it into the model, in Discovery, strings with lowercase letters, such as 'test' and 'test1', do not appear.

8 replies

null
    • Daniel_Chitlaru
    • 2 mths ago
    • Reported - view

    String comparison in IMDB is case-insensitive by default. ETL and previews that iterate over existing values will return all values as they were inserted, but discovery queries with GROUP BY etc will cause it to merge values that differ only by case.

    Currently the only way to change it is by editing irdb.ini (located in Pyramid installation directory/imdb)

    Add a line: collation=X where X can be one of the following:

    OrdinalIgnoreCase   (Default)

    Ordinal (This is probably what you want)

    InvariantCulture

    InvariantCultureIgnoreCase

    CurrentCulture

    CurrentCultureIgnoreCase

      • TKH_TKH
      • 2 mths ago
      • Reported - view

       I used the suggested config, but it didn't help. Perhaps there is another solution?

      • Senior Director of Product
      • Ian_Macdonald
      • 2 mths ago
      • Reported - view

       

      Did you restart the IMDB Server after changing the .ini file?

      Ian

      • TKH_TKH
      • 2 mths ago
      • Reported - view

       Yeah, I've been reloading IMDB after changing the .ini file.

      • TKH_TKH
      • 2 mths ago
      • Reported - view

       screenshot:

      • Senior Director of Product
      • Ian_Macdonald
      • 2 mths ago
      • Reported - view

       reloading or restarting?

      • TKH_TKH
      • 2 mths ago
      • Reported - view

       Yes, I restarted the IMDB server after changing the .ini file

    • Senior Director of Product
    • Ian_Macdonald
    • 2 mths ago
    • Reported - view

    Hi 

    Also, remember that the Pyramid semantic model is a multidimensional model and is using the Strings column to create a DIMENSION (or Attribute if you prefer) in its dimensional model, which will be a list of unique values from that column. In the vast majority of cases, it is desired that case is ignored as data is often a little "dirty". Thus you would not want a dimension of customer names to differentiate between column values of Joe Smith and joe smith.

    When you place the customer name dimension on a grid in Discover, it lists the unique values of the underlying column as dimension MEMBERS, not the individual values from each record.

    Hope that helps.

    Ian

Content aside

  • Status Answered
  • 2 mths agoLast active
  • 8Replies
  • 88Views
  • 3 Following