and yes, even though it helps "date" me, i am still a fan of Prince's 1999.
so it's time to leave your watch & wallet at home & wrap up your money & mobile phone in zip-locked baggies. here comes the water :-)
btw you too can help celebrate this holiday, grab a water pistol or bucket of water & douse everybody in sight ;-)
if you're interested in using icu4j's new acceptLanguage method, you'll need to wrap it. the method uses an 'out-parameter' (a boolean array) to report whether it returned a fallback locale (i.e. it couldn't find a suitable locale among the server's installed locales, so it returned a fallback locale instead). coldfusion won't pick up on that returned boolean array. below find some java code for this (it returns a structure with the selected locale and whether or not it was a fallback locale):
import java.util.HashMap;
import com.ibm.icu.util.ULocale;
public class ULocaleAcceptLanguage {
/*
class: ULocaleAcceptLanguage
version: 15-jul-2005
author: Paul Hastings paul@sustainableGIS.com
notes: simple wrapper class for ICU4J acceptLanguage
*/
public final static HashMap getULocale(String httpAcceptLanguage){
HashMap results = new HashMap();
boolean[] fallback = new boolean[1];
ULocale thisLocale = ULocale.acceptLanguage(httpAcceptLanguage,fallback);
Boolean fallB= new Boolean(fallback[0]);
results.put("locale",thisLocale.toString());
results.put("fallback",fallB.toString());
return results;
}
}
compile this and drop it in your cfinstall classes dir. you can then make use of it:
<cfscript>
aL=createObject("java","ULocaleAcceptLanguage");
acceptLanguageStr="en-us,th;q=0.7,ar;q=0.3";
uL=al.GetULocale(acceptLanguageStr);
</cfscript>
<cfdump var="#uL#">
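if you can't (or don't want to) pull in icu4j, here's a rough sketch of the same idea using only the JDK's Locale.LanguageRange and Locale.lookup. this is not icu4j's algorithm; the class name, the Match holder and the "supported locales" list are all made up for illustration. it just shows handing back the locale and the fallback flag together instead of via an out-parameter:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class AcceptLanguageSketch {

    // hypothetical result holder: the chosen locale plus a "was this a fallback?" flag
    public static final class Match {
        public final Locale locale;
        public final boolean fallback;

        public Match(Locale locale, boolean fallback) {
            this.locale = locale;
            this.fallback = fallback;
        }
    }

    // pick the best supported locale for an Accept-Language header;
    // if nothing matches, return defaultLocale and flag it as a fallback.
    // note: a malformed header makes LanguageRange.parse throw IllegalArgumentException
    public static Match pick(String header, List<Locale> supported, Locale defaultLocale) {
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(header);
        Locale best = Locale.lookup(ranges, supported);
        return (best != null) ? new Match(best, false) : new Match(defaultLocale, true);
    }

    public static void main(String[] args) {
        List<Locale> supported = Arrays.asList(Locale.forLanguageTag("th"), Locale.GERMAN);
        Match m = pick("en-us,th;q=0.7,ar;q=0.3", supported, Locale.ENGLISH);
        System.out.println(m.locale.toLanguageTag() + " fallback=" + m.fallback);
    }
}
```

since the result is a plain object with two public fields, there's no boolean array for coldfusion to lose track of.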
- updated to Unicode 4.1
- collation engine updated to UCA 4.1
- fully conformant with CLDR 1.3
- charset detection framework (which looks very useful)
- message formatting apostrophe solution
- additional usability APIs
- new currency listing API
- more API for accessing CLDR data
- Coptic and Ethiopic calendars (that makes 8 icu4j calendars and Dr. Ghasem Kiani's persian calendar for a total of 9, count 'em 9, calendars)
- more efficient data loading
and in case you were wondering, today (2-jul-2005) is October 25, 1721 in the Coptic calendar and October 25, 7497 (Amete Alem Era) in the Ethiopic calendar system.
which brings us to the point of this blog entry: this method expects the year argument to be a persian calendar "year" (right now it's 1383 in the persian calendar). which i didn't quite grasp at first, as the other calendars (gregorian, buddhist and japanese) with leap years have an isLeapYear method that expects a gregorian year (yes, even the buddhist and japanese calendar classes expect a gregorian year, i imagine this is because these calendars extend the gregorian calendar class). and that's the way i expected the new persian calendar to behave (my own cultural bias--i use the buddhist and gregorian calendars on a daily basis). but it doesn't and why the heck would it? it is a persian calendar after all. so that got me to thinking about the other calendars and the way these "should" work and what other cultural biases have leaked into our code and test harnesses--especially the tests.
first thing i did was to rewrite the i18nIsLeapYear functions across all the calendars to expect a year argument in that calendar's system (it converts to gregorian year as needed and now automagically returns false for calendars lacking the concept of a "leap year").
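to make the "year in that calendar's own system" idea concrete, here's a small sketch using the JDK's java.time.chrono classes rather than icu4j or the CFC itself (so the class and method names below are made up for illustration). the buddhist year is converted to a gregorian year internally, the caller never has to think in gregorian:

```java
import java.time.chrono.IsoChronology;
import java.time.chrono.ThaiBuddhistChronology;

public class LeapYearInOwnCalendar {

    // buddhist era year = gregorian year + 543
    static final int BUDDHIST_OFFSET = 543;

    // takes a buddhist era year, converts to gregorian only on the inside
    public static boolean isBuddhistLeapYear(int buddhistYear) {
        return IsoChronology.INSTANCE.isLeapYear(buddhistYear - BUDDHIST_OFFSET);
    }

    public static void main(String[] args) {
        int be = 2547; // 2004 in the gregorian calendar
        System.out.println("hand-rolled: " + isBuddhistLeapYear(be));
        // the JDK's own buddhist chronology also expects a buddhist year
        System.out.println("jdk:         " + ThaiBuddhistChronology.INSTANCE.isLeapYear(be));
        // passing a gregorian year by mistake silently answers for the wrong year
        System.out.println("oops:        " + isBuddhistLeapYear(2004));
    }
}
```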
then i went a hunting for any other places where my cultural bias might have leaked thru....and promptly found it in the getYear function. the getYear function takes a gregorian year value and returns the year in that calendar's system. i was doing that by creating a date:
(and just in case you were wondering, the 2 for the day value is to make sure the date value created fell into that year, given that we're using UTC as the time zone standard for all the calendars). and then setting the calendar object to that date and returning the value for that calendar object's YEAR field:
return tCalendar.get(tCalendar.YEAR);
simple and worked swell for the gregorian, buddhist and japanese calendars because these calendars' years start at the same time. but after looking at the year values of formatted dates from the other calendars i realized that the getYear function was returning horrible nonsense for the other 4 calendars. without realizing it, i'd let my calendar bias creep in and assumed the calendars were all the same as far as years were concerned. gregorian 2-jan actually falls into different calendar years depending on the calendar (of course, they're different freaking calendars). and the tests were only reporting whether the getYear function "worked" by checking if the year was a positive integer, no eyeball comparisons against the year bits of the formatted date strings. there's a lesson here somewhere.
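here's a sketch of the kind of test that would have caught this. it uses the JDK's java.time.chrono calendars (not icu4j, and not the CFC's actual test harness; the class and method names are made up): take the first and last days of a gregorian year and check whether the calendar under test agrees they're in the same year.

```java
import java.time.LocalDate;
import java.time.chrono.Chronology;
import java.time.chrono.HijrahChronology;
import java.time.chrono.ThaiBuddhistChronology;
import java.time.temporal.ChronoField;

public class YearBiasCheck {

    // the year of a gregorian date as seen by another calendar system
    public static long yearOf(Chronology calendar, LocalDate gregorianDate) {
        return calendar.date(gregorianDate).getLong(ChronoField.YEAR);
    }

    // true only if 2-jan and 31-dec of the gregorian year land in the same
    // year of the given calendar, i.e. "one year in, one year out" is safe
    public static boolean sameYearAllYear(Chronology calendar, int gregorianYear) {
        LocalDate first = LocalDate.of(gregorianYear, 1, 2);
        LocalDate last = LocalDate.of(gregorianYear, 12, 31);
        return yearOf(calendar, first) == yearOf(calendar, last);
    }

    public static void main(String[] args) {
        System.out.println("buddhist: " + sameYearAllYear(ThaiBuddhistChronology.INSTANCE, 2005));
        System.out.println("islamic:  " + sameYearAllYear(HijrahChronology.INSTANCE, 2005));
    }
}
```

a check like this compares the calendar against itself instead of just asking "is it a positive integer?".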
so better grab the new code and maybe give the calendars a good poking at to make sure no other cultural bias is left in it.
note that this version of the persian calendar uses a "well-known arithmetic algorithm for calculating the leap years" rather than astronomical calculations.
i'd like to publicly thank Dr. Ghasem Kiani for his work on this project, we've been waiting quite a while for a persian calendar to round off our i18n calendars. thanks.
a lunar calendar was used in japan from the 14th to the 19th century. that calendar had a six day week and those six days were known as rokuyo. and like any other calendar system, each day had a name and a particular meaning (you do know that the english weekdays are named after one of the seven "planets" of ancient times?). and of course, each day had a significance:
- sakigachi good luck in the morning, bad luck in the afternoon
- tomobiki good luck all day, except at noon
- sakimake bad luck in the morning, good luck in the afternoon
- butsumetsu Unlucky all day, as it is the day Buddha died
- taian 'the day of great peace', a good day for ceremonies
- shakku bad luck all day, except at noon
while i'd guess few people would admit to closely adhering to this system, it does invoke some strange "better safe than sorry" behaviors. for instance, some hospital patients in japan won't agree to be discharged on butsumetsu day, as it's regarded as being very unlucky. rather they'd stay the extra 24 hours to be discharged on a lucky taian day.
the calculations for determining rokuyo turn out to be surprisingly difficult. in fact, the only published code i ever saw for this was developed by Eirik Rude, a cf developer (at that time living in japan). the complexity comes from the need to calculate lunar months (remember the old japanese calendar?). since i wanted to integrate this functionality with our existing icu4j-based calendars, i poked thru the lunar calendars (chinese, islamic and hebrew) that i knew about to see if we could use any of these. of course, the old japanese lunar calendar was basically the lunisolar chinese calendar. using Eirik's basic logic and the icu4j library i was able to considerably reduce the code's complexity (the complexity's still there, but i pushed it down into the icu4j java library where smarter people than i have already dealt with it).
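once you have the lunar month and day in hand, the last step is tiny. the sketch below uses the commonly cited rule (add lunar month and lunar day, take the remainder mod 6); i'm not claiming it's line-for-line what Eirik's code or the CFC does, and the class and method names are made up. getting the lunar month/day is the hard part and is assumed to come from somewhere else (e.g. a lunisolar calendar class); a leap month uses the number of the month it follows.

```java
public class Rokuyo {

    // index = (lunar month + lunar day) % 6, names as spelled in this post
    private static final String[] NAMES = {
        "taian", "shakku", "sakigachi", "tomobiki", "sakimake", "butsumetsu"
    };

    // lunarMonth: 1-12 (a leap month uses the number of the month it follows)
    // lunarDay:   1-30
    public static String forLunarDate(int lunarMonth, int lunarDay) {
        if (lunarMonth < 1 || lunarMonth > 12 || lunarDay < 1 || lunarDay > 30) {
            throw new IllegalArgumentException("not a lunar month/day: " + lunarMonth + "/" + lunarDay);
        }
        return NAMES[(lunarMonth + lunarDay) % 6];
    }

    public static void main(String[] args) {
        // first day of each lunar month resets the cycle
        for (int month = 1; month <= 12; month++) {
            System.out.println(month + "/1 " + forLunarDate(month, 1));
        }
    }
}
```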
the rokuyo testbed is here and the i18n calendars package incorporates this new functionality (pick japanese calendar from the select). and this is a good resource if you want to read more about rokuyo.
the code is also considerably improved, it's now based on ICU4J version 3.2 and its ULocale class (232 locales, 100 more than blackstone). several of the more commonly used functions have been re-written and we're seeing 3x-4x speed improvement over the older versions. frankly, i'm a bit baffled why. for instance:
following the ICU4J API and some examples, we initialized date formatting objects with the calendar class (Buddhist, Chinese, Gregorian, Hebrew, Islamic,Japanese) we were working with:
var thisCalendar=aCalendar.init(utcTZ,thisLocale);
// return formatted date
return aDateFormat.getDateInstance(thisCalendar,tDateFormat,
thisLocale).format(dateConvert("utc2local",arguments.thisDate));
was reworked into this:
var tDateFormatter=aDateFormat.getDateInstance(tDateFormat,thisLocale);
// swap calendars
tDateFormatter.setCalendar(aCalendar.init(utcTZ,thisLocale));
return tDateFormatter.format(dateConvert("utc2local",arguments.thisDate));
this builds the date formatter object with the default calendar, then we swap it to the calendar we want to use (the tDateFormatter.setCalendar bit). that sped up this function 3x-4x! while it "seems" less efficient it actually worked quite a bit faster.
you can see the testbed and download the CFC package here. any comments appreciated.
it doesn't do much except format/convert gregorian dates to the persian calendar and back again (right now it can only parse medium/short persian date formats). still lacks calendar math, real persian date string parsing, arabic-hindic digits date formats, etc.
so what's a persian (or iranian) calendar? why it's the formal calendar in general use in iran, also known as the solar hijri calendar and sometimes as the jalali calendar. i've also seen it described as the shamsi calendar. frankly i have no idea which is correct so i'll stick with "persian". since it's one of the few calendars designed in the era of accurate positional astronomy, it's probably the most accurate solar calendar around. you can read more here or here.
i've also been looking at this java calendar class. it has a boatload of calendars (besides persian it has mayan, nepali, hindu, coptic and believe it or not a french revolutionary calendar).
the traditional Chinese calendar is a lunisolar calendar (the same type as the Hebrew calendar). months start with a new moon, with each month numbered according to solar events. why? to guarantee that month # 11 will always contain the winter solstice. how? leap months are inserted in certain years (i feel another non-gregorian calendar induced headache coming on). these leap months are numbered the same as the month they follow. which month is a leap month? depends entirely on the movements of the sun and moon (i.e. i can't follow the math very far). the normal ERA field differs from other calendars as it holds the 60 year "cycle" number, right now we're in the 78th cycle which began in 1984 AD. years are counted sequentially, numbering from the 61st year of the reign of Huang Di, 2637 BC, which is designated year 1 on the Chinese calendar (yes, that's right, this calendaring system is over 4,000 years old). let's look at an example:
星期三 20x78-9-13
where 20 is the year in the current cycle, 78 is the cycle for this calendar (ERA in other calendars), 9 is the month and 13 is the day.
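the cycle/year arithmetic is easy to check by hand. the sketch below just follows the numbers in this post (60 year cycles, and the "+ 2637" offset between gregorian and chinese years); it isn't icu4j code and the names are made up. the gregorian year it returns is the one in which that chinese year begins, since chinese new year falls a few weeks into the gregorian year.

```java
public class ChineseCycleMath {

    // offset used in this post: extended chinese year = gregorian year + 2637
    static final int EPOCH_OFFSET = 2637;
    static final int YEARS_PER_CYCLE = 60;

    // years since calendar start, from cycle number and year within the cycle (both 1-based)
    public static int extendedYear(int cycle, int yearOfCycle) {
        return (cycle - 1) * YEARS_PER_CYCLE + yearOfCycle;
    }

    // gregorian year in which that chinese year begins
    public static int gregorianYear(int cycle, int yearOfCycle) {
        return extendedYear(cycle, yearOfCycle) - EPOCH_OFFSET;
    }

    public static void main(String[] args) {
        // the example date above: year 20 of cycle 78
        System.out.println("extended year:  " + extendedYear(78, 20));
        System.out.println("gregorian year: " + gregorianYear(78, 20));
        // first year of the 78th cycle
        System.out.println("cycle 78 began: " + gregorianYear(78, 1));
    }
}
```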
since ICU4J's ChineseCalendar defines an additional field (for leap month) and redefines the way the ERA field (no longer AD,BC, etc.) is used, this CFC has to use a different date format class, ChineseDateFormat.
this CFC adds 4 generic functions (i forgot that some calendars need special date logic):
- isBefore to compare two dates to tell if one is before the other
- isAfter which compares two dates to tell if one is after the other
- getJulianDay returns the true Julian day for a given date
- getExtendedYear returns the extended year, i.e. years since calendar start (in this case, current year + 2637)
i'll retrofit these to the other non-gregorian calendars. the date logic is probably more useful to the calendars that use calendar math different from the gregorian calendar (chinese, hebrew, islamic).
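why bother with a julian day? because it's one plain day count that doesn't care which calendar you're looking at, so before/after comparisons fall out for free. a minimal sketch using the JDK's JulianFields (not the CFC's or icu4j's implementation; the class and method names here are made up):

```java
import java.time.LocalDate;
import java.time.temporal.JulianFields;

public class DateLogicSketch {

    // julian day number for a date: a single running day count
    public static long julianDay(LocalDate date) {
        return date.getLong(JulianFields.JULIAN_DAY);
    }

    // comparing day counts works no matter how a calendar slices up months and years
    public static boolean isBefore(LocalDate a, LocalDate b) {
        return julianDay(a) < julianDay(b);
    }

    public static boolean isAfter(LocalDate a, LocalDate b) {
        return julianDay(a) > julianDay(b);
    }

    public static void main(String[] args) {
        LocalDate a = LocalDate.of(2003, 10, 8);
        LocalDate b = LocalDate.of(2005, 7, 15);
        System.out.println(julianDay(a) + " before " + julianDay(b) + "? " + isBefore(a, b));
    }
}
```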
and 7 functions that are specific to this calendar (though i guess some can be applied to other calendars):
- isLeapMonth determines if a given date is in a leap month
- getCycle returns cycle for given date
- getCycleYear returns year in cycle for given date
- getMonth returns month in cycle year for given date
- getDay returns day in month for given date
- getDayOfYear returns day of cycle year for given date
- getWeek returns week of cycle year for given date
the CFC's testbed is here. posted to the devnet gallery where i guess it will become available sooner or later.
next the astronomical calendar. this one is quite tricky, it's also somewhat in a state of flux (the ICU4J team's working on this code) but since it forms the basis of some of the existing calendars might as well give it a shot.

