Handling HTML Entities and Unicode
[Image: A robot hand using a paint scraper to remove paint from a wall.]
What if your text data is contaminated with non-ASCII Unicode characters and HTML entities? Ideally you want your persisted data to be pristine. Metaphorically, it should be prêt à manger (ready to eat). In principle I also want my text to be as simple as possible: ASCII characters and nothing else. This is sometimes achievable without losing too much information.
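As a rough illustration of that idea, here is a minimal Python sketch using only the standard library. The function name `to_plain_ascii` is hypothetical, and the approach (decode entities, decompose accented letters, drop whatever is not ASCII) is one assumption about how the cleaning might be done, not necessarily the method used in the rest of this post.

```python
import html
import unicodedata


def to_plain_ascii(text: str) -> str:
    """Hypothetical helper: decode HTML entities, then reduce to ASCII."""
    # Turn entities such as &amp; or &ecirc; into the characters they represent.
    decoded = html.unescape(text)
    # NFKD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", decoded)
    # Discard every non-ASCII code point, including the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_plain_ascii("pr&ecirc;t &agrave; manger &amp; caf\u00e9"))
# → pret a manger & cafe
```

This is lossy by design: characters with no ASCII decomposition (for example "ø" or "ß") are silently dropped rather than transliterated, so it only suits data where that loss is acceptable.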