How to get a character from a codepoint in Spark SQL
prequel.co

At Prequel, we are obsessed with data integrity. It’s our whole thing: we move data between various systems, and our value prop is that we’ll do it fast and never miss a row. To back that up, we’ve built an in-house data integrity capability that identifies discrepancies between tables in different database systems. We use it in our tests and also expose it as a feature for our customers. This obsession means we’ve spent a lot of time testing the weird edge cases of data transfer between different systems. And let me tell you, the world of data systems is full of… opinions.

For example, we learned the hard way that you absolutely must test the full range of Unicode characters. Not all systems (or their drivers) are created equal. Postgres, …
