So, as part of my ongoing self-education, I’m attempting to learn the Arabic language. There’s good reason for this: my wife Hanan is half Syrian, half English and speaks fluent Arabic. I’m also hoping to visit both Saudi Arabia and Syria this year, so I now have the kick up the backside I need to try and learn Arabic a little more formally.
An important part of learning Arabic, and something I’ve avoided so far, is learning the alphabet and writing system. I’m still in the early stages of this, and any exposure to the alphabet helps.
As a little exercise I’ve decided to write a bit of PHP script that takes a string of Arabic characters and transliterates them into the Latin alphabet. What this basically equates to is a big ole switch statement that checks the double-byte value of each Arabic character and then outputs the appropriate Latin characters.
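To give a flavour of the approach, here is a minimal sketch in Python rather than PHP (the actual script isn't shown in this post), with a lookup table standing in for the switch statement. The table is partial, and its letter choices are guesses read off the sample output further down, so treat the names and mappings as assumptions rather than the real thing.

```python
# Partial letter table, guessed from the sample output in this post.
# The real PHP script may map some letters differently.
ARABIC_TO_LATIN = {
    "ا": "a", "ب": "b", "ت": "t", "ج": "j", "ح": "H", "د": "d",
    "ر": "r", "ز": "z", "ش": "sh", "ص": "S", "ع": "A", "غ": "gh",
    "ف": "f", "ق": "q", "ك": "k", "ل": "l", "م": "m", "ن": "n",
    "ه": "h", "و": "o", "ي": "i", "ة": "a", "ء": "a",
}


def transliterate(text):
    """Swap each known Arabic letter for its Latin stand-in.

    Anything not in the table (spaces, punctuation, unknown letters)
    is passed through unchanged.
    """
    return "".join(ARABIC_TO_LATIN.get(ch, ch) for ch in text)


print(transliterate("نادي لفربول"))  # nadi lfrbol
```

A dictionary does the same job as one switch case per letter, but keeps the whole table in one place, which makes it easier to tweak individual mappings later.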
Sounds pretty easy? Well, it is really, but it has some drawbacks. It’s not really possible to get a deadly accurate result from Arabic as it is normally typed on the internet. The reason is that the short vowels in Arabic are written as diacritics: small marks such as ´ placed above or below a letter, which denote respectively an “a” sound or an “i” sound after that letter. (Oh, and there are also 2 or 3 sounds that don’t exist in English, but that’s another problem.)
This is all fine, but the problem is that these diacritics are usually left out of typed Arabic, except where they are needed to distinguish between two otherwise ambiguous words. If the information isn’t there to start with, we can’t work out the sound with 100% accuracy. We can, however, produce an approximation, and most of the time this is good enough.
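To make the missing-vowel problem concrete, here is a small Python sketch (an illustration only, not the PHP script itself). It transliterates the word for "he wrote" twice: once fully vowelled, and once as it would normally be typed. The tiny table and its Latin spellings are assumptions made for the example.

```python
# Short-vowel marks as Unicode combining characters.
FATHA, DAMMA, KASRA, SUKUN = "\u064e", "\u064f", "\u0650", "\u0652"

# Tiny illustrative table: three consonants plus the vowel marks.
# The Latin spellings are assumptions, not the script's real table.
TABLE = {
    "\u0643": "k",  # kaf
    "\u062a": "t",  # ta
    "\u0628": "b",  # ba
    FATHA: "a",     # mark above the letter
    DAMMA: "u",     # mark above the letter
    KASRA: "i",     # mark below the letter
    SUKUN: "",      # explicitly marks "no vowel here"
}


def transliterate(text):
    """Map each character through TABLE, passing unknown ones through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


# The same word, with and without its short-vowel marks.
vowelled = "\u0643" + FATHA + "\u062a" + FATHA + "\u0628" + FATHA
bare = "\u0643\u062a\u0628"

print(transliterate(vowelled))  # kataba
print(transliterate(bare))      # ktb
```

With the marks present the output is readable as it stands; without them only the consonant skeleton survives, which is exactly why the approximation needs a bit of guesswork from the reader.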
Here’s a little sample sentence before and after transliteration.
قالت مجموعة الشركات “دبي انترناشيونال كابيتال” انها لم تعد مهتمة بشراء حصة الاغلبية في نادي لفربول الانجليزي لكرة القدم.
comes out as …
qalt mjmoAa alshrkat ‘dbi antrnashional kabital’ anha lm tAd mhtma bshraa HSa alaghlbia fi nadi lfrbol alanjlizi lkra alqdm.
You can even start to spot some familiar words in there: ‘dbi antrnashional kabital’ = Dubai International Capital (there is no p sound in Arabic, so p’s come out as b’s).
I’ll have more of a play and see if I can improve it, but the result is already good enough for Hanan to figure out the meaning without seeing the original (given a bit of head scratching, admittedly). More to the point, it’s a good way for me to get a quick gist of how some Arabic script should sound while I’m learning the alphabet.