We were unable to load Disqus. If you are a moderator please see our troubleshooting guide.

Vũ Thị Bích Thư • 1 year ago

Thank for your helpful sharing.
Your explanation is very detailed and easy-to-understand.
Btw, I also research about some related topics and I found your post.
About this point:

Các ký tự dấu rời nằm trong một dãy Unicode biết trước có tên Combining Diacritical Marks (\u0300-\u036f) nên chúng ta dễ dàng xóa chúng một cách tổng quát.

I found other alternative approach mentioned here and I asked ChatGPT to compare some approaches each other below:

1. Using [\u0300-\u036f]: This pattern matches a specific range of diacritical marks in the Basic Latin block. It is more limited and doesn’t cover all possible diacritical marks.

2. Using [\p{M}]: This matches any mark character, including diacritical marks. It’s broader and covers more mark characters than the specific range pattern.

3. Using \p{Diacritic}: This is a more targeted approach for matching diacritical marks specifically, but it requires Unicode property escapes, which are not supported in all environments. It’s a clear and concise way to match diacritical marks if your environment supports Unicode property escapes.


I've just researched and wanted to comment to say thank to you also share what I learnt to remember what I read from your post and what I researched.
Let me know if any discussions or feel free to correct my knowledge

Many thanks.