tags: Xerces



Graphing Traffic Speed Data

For the past few days i've been working on a new project. You see, i can't stand the traffic and i'm trying to avoid it. In order to do that i need to know when it's the worst. Thankfully the state of Maryland measures the speed of traffic on major roads. I take that data and graph it. Details follow.

My source is a web page that is hardly worth calling HTML. It's probably version 3.2, maybe 4.0; in either case it's (1) not easily parsable and (2) not semantically marked up. So i wrote a script to convert it into XHTML 1.0.
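The conversion script itself isn't shown here, so the following is only a rough sketch of the idea in Python, using the standard library's `html.parser`. The names `XhtmlWriter`, `to_xhtml` and the `VOID` set are made up for illustration. It lowercases tags, quotes attributes, self-closes void elements, and closes anything left open, which is enough to make the output well-formed for an XML parser. It does not try to produce valid XHTML 1.0, and it drops comments and the doctype.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never have content; a partial list, chosen for illustration.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlWriter(HTMLParser):
    """Re-emit sloppy HTML as well-formed markup (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.out = []    # pieces of the output document
        self.stack = []  # elements that are currently open

    def handle_starttag(self, tag, attrs):
        # html.parser already lowercases tag and attribute names.
        # Valueless attributes such as `nowrap` become nowrap="nowrap".
        attr_text = "".join(
            f" {name}={quoteattr(name if value is None else value)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{attr_text} />")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag, or the end tag of a void element
        # Close unclosed inner elements until the matching one is closed.
        while True:
            top = self.stack.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def to_xhtml(html_text):
    writer = XhtmlWriter()
    writer.feed(html_text)
    writer.close()
    # Close whatever the source never closed.
    while writer.stack:
        writer.out.append(f"</{writer.stack.pop()}>")
    return "".join(writer.out)
```

Note that an unclosed `<td>` ends up nested inside the previous one rather than becoming its sibling. A real converter would need to know HTML's implied-end-tag rules to get that right.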

Now it becomes easy to use Xerces and just pluck the data i need. It goes directly into a database. Where are my calculations? On the back of an envelope in the trash, of course. Reading my source every 5 minutes, i'm collecting 400 kilobytes per day.
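As an illustration of the pluck-and-store step, here is a minimal sketch in Python. It uses `xml.etree.ElementTree` as a stand-in for Xerces and SQLite as a stand-in for the database. The sample page, the two-column table layout, the `reading` table, and the function names are all assumptions made for the example, not the real CHART page or schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Made-up sample of a cleaned-up page; the real column layout is an assumption.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><td>Sensor A</td><td>54</td></tr>
      <tr><td>Sensor B</td><td>31</td></tr>
    </table>
  </body>
</html>"""


def extract_readings(xhtml_text):
    """Return (sensor, mph) pairs from rows that hold a name and a number."""
    root = ET.fromstring(xhtml_text)
    readings = []
    for row in root.findall(".//x:tr", XHTML_NS):
        cells = [(td.text or "").strip() for td in row.findall("x:td", XHTML_NS)]
        if len(cells) == 2 and cells[1].isdigit():
            readings.append((cells[0], int(cells[1])))
    return readings


def store_readings(conn, taken_at, readings):
    """Insert one timestamped batch of readings (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reading (taken_at TEXT, sensor TEXT, mph INTEGER)"
    )
    conn.executemany(
        "INSERT INTO reading VALUES (?, ?, ?)",
        [(taken_at, sensor, mph) for sensor, mph in readings],
    )
    conn.commit()
```

As for the lost envelope: one read every 5 minutes is 288 reads a day, so 400 kilobytes per day works out to roughly 1.4 kilobytes collected per read.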

With it in a database i can create fancy views any which way i want. The next views to create are weekday versus weekend traffic, followed by school season versus non-school season.
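A weekday versus weekend view could look something like the sketch below. It assumes SQLite and a hypothetical `reading(taken_at, sensor, mph)` table; the rows and dates are made up. SQLite's `strftime('%w', ...)` returns `'0'` for Sunday and `'6'` for Saturday, which is what the `CASE` relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (taken_at TEXT, sensor TEXT, mph INTEGER)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [
        ("2024-01-05 08:00:00", "Sensor A", 30),  # a Friday (made-up data)
        ("2024-01-05 08:05:00", "Sensor A", 40),
        ("2024-01-06 08:00:00", "Sensor A", 60),  # a Saturday
    ],
)

# Average speed per sensor and hour of day, split into weekday and weekend.
conn.execute(
    """
    CREATE VIEW speed_by_day_type AS
    SELECT sensor,
           CASE WHEN strftime('%w', taken_at) IN ('0', '6')
                THEN 'weekend' ELSE 'weekday' END AS day_type,
           strftime('%H', taken_at) AS hour,
           AVG(mph) AS avg_mph
    FROM reading
    GROUP BY sensor, day_type, hour
    """
)

rows = conn.execute(
    "SELECT day_type, hour, avg_mph FROM speed_by_day_type ORDER BY day_type"
).fetchall()
```

A school season view would follow the same pattern, with the `CASE` testing the date against a range of term dates instead of the day of the week.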

Finally i wrote a PHP script to generate graphs using JpGraph. It doesn't look great now because i haven't collected enough data for the averages to appear as averages.

If anyone can find similar data for Virginia or DC, let me know. The CHART website has a map to display the locations of the sensors. If anyone has suggestions or comments, just speak up.

Guitar Test, Truncated XML

After a short workout, i went with Marc directly from work to Guitar Center. We went to the back room where we tested a varied selection of Martins and Taylors.

The Taylors have a tinny and bright sound whereas the Martins have a mellow and full-bodied sound. After hearing them, i'm now thinking about the D-28; after all, this guitar will last me a lifetime.

I didn't work on the comment feature on account of this dilemma. I want to provide truncated versions of articles on the top page, but don't know how to truncate XML without invalidating it. I do have one idea. Basically read and parse the content maintaining a stack of open tags. Count the number of characters printed. When you hit the limit, close all the open tags in reverse order.
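The stack idea above can be sketched roughly as follows. This is Python with the standard library's expat bindings rather than Xerces, and `truncate_xml` is a made-up name. It counts only text characters (not markup), assumes `limit` is not negative, and writes empty elements such as `<br/>` back out as `<br></br>`, which is still well-formed.

```python
from xml.parsers import expat
from xml.sax.saxutils import escape, quoteattr


def truncate_xml(source, limit):
    """Return a well-formed copy of `source` cut off after `limit` text characters."""
    out = []        # pieces of the output document
    open_tags = []  # stack of elements that are currently open
    state = {"remaining": limit, "done": False}

    def start(name, attrs):
        if state["done"]:
            return
        attr_text = "".join(
            f" {key}={quoteattr(value)}" for key, value in attrs.items()
        )
        out.append(f"<{name}{attr_text}>")
        open_tags.append(name)

    def end(name):
        if state["done"]:
            return
        out.append(f"</{name}>")
        open_tags.pop()

    def text(data):
        if state["done"]:
            return
        # expat hands over decoded text, so count first and re-escape after.
        piece = data[: state["remaining"]]
        out.append(escape(piece))
        state["remaining"] -= len(piece)
        if state["remaining"] <= 0:
            # Limit reached: close every open tag, innermost first.
            while open_tags:
                out.append(f"</{open_tags.pop()}>")
            state["done"] = True

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = text
    parser.Parse(source, True)
    return "".join(out)
```

For example, `truncate_xml("<div><p>Hello <b>big</b> world</p><p>More</p></div>", 8)` gives `<div><p>Hello <b>bi</b></p></div>`. The parser still reads the rest of the input after the cut; the handlers just ignore it, which keeps the sketch short at the cost of some wasted work on long articles.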

PHP and Java have nice libraries to parse XML. I just prefer Xerces.