robsmart.co.uk

Emerging technology, Open Source and the Internet

Transliteration again … now complete

Rob Smart | February 2, 2007

I’ve spent a bit more time on my Arabic to English transliteration, and it now does English to Arabic as well. So put your name into the text area below, hit the “To Arabic” button and see what your name looks like in Arabic characters. Also have a go at pasting some Arabic text in to find out how those characters are pronounced (well… roughly pronounced ;) ). You have to be a bit creative and imagine that there are a few more vowels between the letters when doing the Arabic to English (see my earlier transliteration post below for an explanation why).

(Please do not do anything permanent with the output of this, it is meant as an experiment. There will be errors in the output so don’t get a tattoo with anything that comes out of it.)



Arabic to English transliteration

Rob Smart | February 1, 2007

So as part of my ongoing self-education ;) I’m attempting to learn the Arabic language, and for good reason, as my wife Hanan is half Syrian, half English and speaks fluent Arabic. I’m also hopefully going to be visiting both Saudi and Syria this year, so I now have the kick up the backside I need to try and learn Arabic a little bit more formally.

An important part of learning Arabic, and something I’ve avoided so far, is learning the alphabet and writing system. I’m still in the early stages of this, and any exposure to the alphabet helps.

As a little exercise I’ve decided to write a bit of PHP script that takes a string of Arabic characters and transliterates them into the Latin alphabet. What this basically equates to is a big ole switch statement checking the double-byte value of each Arabic character and then outputting the appropriate Latin characters.
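The PHP script itself isn’t shown here, so the following is only a rough sketch of the same idea, written in Python and using a lookup table in place of the switch statement. The table is partial, and the Latin spellings in it are assumptions for illustration, not the script’s real mapping.

```python
# Hypothetical, partial table: one entry per Arabic letter, keyed by code point.
# The Latin spellings are illustrative guesses, not the original script's mapping.
ARABIC_TO_LATIN = {
    "\u0627": "a",   # alef
    "\u0628": "b",   # beh
    "\u062A": "t",   # teh
    "\u062F": "d",   # dal
    "\u0631": "r",   # reh
    "\u0634": "sh",  # sheen (one Arabic letter, two Latin letters)
    "\u0643": "k",   # kaf
    "\u0644": "l",   # lam
    "\u0645": "m",   # meem
    "\u0646": "n",   # noon
    "\u0648": "o",   # waw
    "\u064A": "i",   # yeh
}


def to_latin(text):
    """Swap each known Arabic letter for its Latin spelling.

    Anything not in the table (spaces, punctuation, unmapped letters)
    is passed through unchanged.
    """
    return "".join(ARABIC_TO_LATIN.get(ch, ch) for ch in text)


print(to_latin("\u062F\u0628\u064A"))  # dal, beh, yeh -> prints: dbi
```

A dictionary does the same job as one `case` per character, and unmapped characters simply fall through as they are.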

Sounds pretty easy? Well it is really, but it has some drawbacks. It’s not really possible to get an end result that’s deadly accurate using standard internet-typed Arabic. The reason is that the short vowels in Arabic are represented by diacritics such as ´ above or below the characters; these denote, respectively, whether there is to be an “a” sound or an “i” sound after the particular letter. (Oh and there are also 2 or 3 sounds that don’t exist in English but that’s another problem ;) )

This is all fine, but the problem is that these diacritics are usually left out of typed Arabic, except to distinguish between two otherwise ambiguous words. So if the information is not there to start with, we can’t work out the sound with 100% accuracy. However, we can do an approximation, and most of the time this is good enough :)
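To make the missing-vowel problem concrete, here is a small Python sketch (again an illustration, not the PHP script). It assumes a tiny hypothetical consonant table and maps the three short-vowel marks to Latin vowels; in Unicode text each mark follows the letter it sits on.

```python
# Short-vowel diacritics as Unicode combining marks.
SHORT_VOWELS = {
    "\u064E": "a",  # fatha, drawn above the letter
    "\u0650": "i",  # kasra, drawn below the letter
    "\u064F": "u",  # damma, drawn above the letter
}

# Hypothetical three-letter consonant table, just enough for the example.
CONSONANTS = {
    "\u0643": "k",  # kaf
    "\u062A": "t",  # teh
    "\u0628": "b",  # beh
}

TABLE = {**CONSONANTS, **SHORT_VOWELS}


def transliterate(text):
    """Map letters and vowel marks; leave anything unknown untouched."""
    return "".join(TABLE.get(ch, ch) for ch in text)


# The same word typed with and without its vowel marks.
with_marks = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf+fatha, teh+fatha, beh+fatha
without_marks = "\u0643\u062A\u0628"                 # kaf, teh, beh

print(transliterate(with_marks))     # prints: kataba
print(transliterate(without_marks))  # prints: ktb
```

With the marks present the vowels come out for free; without them only the consonant skeleton survives, which is why the reader has to imagine the vowels.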

Here’s a little sample sentence before and after transliteration.

???? ?????? ??????? “??? ???????????? ???????” ???? ?? ??? ????? ????? ??? ???????? ?? ???? ?????? ????????? ???? ?????.

comes out as …

qalt mjmoAa alshrkat ‘dbi antrnashional kabital’ anha lm tAd mhtma bshraa HSa alaghlbia fi nadi lfrbol alanjlizi lkra alqdm.

You can even start to spot some familiar words in there: ‘dbi antrnashional kabital’ = Dubai International Capital (there is no p sound in Arabic, so p’s come out as b’s).
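Going the other way, from Latin letters to Arabic, the missing p has to be folded onto b. A minimal Python sketch of that, assuming a hypothetical and very partial reverse table (not the real script’s mapping):

```python
# Hypothetical, partial Latin-to-Arabic table for illustration only.
LATIN_TO_ARABIC = {
    "a": "\u0627",  # alef
    "b": "\u0628",  # beh
    "p": "\u0628",  # no p in Arabic, so fall back to beh
    "i": "\u064A",  # yeh
    "k": "\u0643",  # kaf
    "l": "\u0644",  # lam
    "t": "\u062A",  # teh
}


def to_arabic(text):
    """Lower-case the input, then swap each known Latin letter for an Arabic one."""
    return "".join(LATIN_TO_ARABIC.get(ch, ch) for ch in text.lower())


# "kapital" and "kabital" end up as the same Arabic string.
print(to_arabic("kapital") == to_arabic("kabital"))  # prints: True
```

Because two Latin letters share one Arabic letter, the mapping is not reversible: transliterating the result back can only ever give a b.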

I’ll have more of a play and see if I can improve it, but the output is good enough for Hanan to figure out the meaning without seeing the original (given a bit of head scratching, admittedly). Still, it’s a good way for me to get a quick gist of how some Arabic script should sound while I’m learning the alphabet.
