A little exercise for fluiddb.el

While catching up on my reading I came across the blog post describing the people app by @paparent (see http://fluiddb.fluidinfo.com/about/paparentapps/paparent/peopleapp/index.html#/ for the app itself).

The app page itself has a short tutorial (click the "guide" link) on how to add yourself, so I thought this might be a good exercise for the FluidDB Emacs mode.
Get the code from the fluiddb.el repository, load fluiddb.el into your Emacs and then load this function:
(defun set-people-app-location (user-id password longitude latitude register-p)
  (let ((*fluiddb-credentials* (cons user-id password))
        (*fluiddb-server* "fluiddb.fluidinfo.com")
        (people-namespace (concat user-id "/people"))
        (user-about-text (concat "Object for the user named " user-id)))

    (when register-p
      (print "Creating the people namespace")
      (fluiddb-create-namespace user-id "people" "Namespace for value for the people-app; see http://fluiddb.fluidinfo.com/about/paparentapps/paparent/peopleapp/index.html#/")

      (loop for (ns . tag) in `((,user-id . "peopleapp")
                                (,people-namespace . "longitude")
                                (,people-namespace . "latitude"))
            do (progn 
                 (print (concat "Creating tag " ns "/" tag))
                 (fluiddb-create-tag ns tag "" nil))))

    (print "Setting coordinates")
    (fluiddb-set-object-about-tag-value user-about-text (concat people-namespace "/longitude") longitude)
    (fluiddb-set-object-about-tag-value user-about-text (concat people-namespace "/latitude") latitude)

    (when register-p
      (print "Registering the user with the people-app")
      (fluiddb-set-object-about-tag-value "collection:peopleapp" (concat user-id "/peopleapp") nil))))

Now you can use this to a) create the needed tags and namespaces (by passing a true value for register-p) and b) set your current location.
Enjoy!


fluiddb.el updated

Earlier this week I completed the update of cl-fluiddb, the Common Lisp interface to FluidDB.  Today I am pleased to announce that the Emacs interface fluiddb.el has also been updated to support the new features introduced last month.

The new interface was described in the previous blog post.  fluiddb.el follows that interface closely, except for the generic function to format namespace and tag names; for the moment only strings are supported here.

This update affects only the low-level routines.  The interactive code doesn't make use of these features yet.

Happy hacking!


An update to cl-fluiddb

About two weeks ago the Fluidinfo people released the long-awaited update to the FluidDB API.  You may want to read the blog posts (here or here) or look at the API documentation itself.

As I maintain two libraries that aid in the use of this API, some work was required.  I started with cl-fluiddb, the Common Lisp library.  (The Emacs Lisp library will come later and should track the changes made to cl-fluiddb described here.)

Forgotten functions added
After reviewing the API docs it became clear that some functions were missing.  This is not new functionality, just something I never needed and hadn't even noticed was missing.  The two added functions offer the HEAD and DELETE verbs for object tag values:
  • (object-tag-has-value-p id tag)
  • (delete-object-tag-value id tag)

New functions
The November update brought two new features: /values and /about.

/values
With /values you can perform operations on multiple objects at the same time.
The new functions are:
  • (query-objects-tag-values query tags-list): retrieve multiple tag values for all objects matching the given query
  • (set-objects-tag-values query tags-values-list): set multiple tag values on all objects matching the given query (tags-values-list is a list of (tag . value) conses)
  • (delete-objects-tag-values query tags-list): delete the tag values on all objects matching the given query

/about
With /about you can access an object via its about tag rather than its id.  This not only makes for readable URLs for FluidDB-based web apps but can also simplify your code, as you don't have to first find the id of an object before adding a tag etc.

The new functions are fairly straightforward.  E.g. instead of get-object, which requires an id as a parameter, you now also have get-object-about, which instead takes a string with the about tag.

The newly added functions for the /about functionality are:
  • (get-object-about about)
  • (get-object-about-tag-value about tag &key want-json accept)
  • (set-object-about-tag-value about tag content &optional content-type)
  • (object-about-tag-has-value-p about tag)
  • (delete-object-about-tag-value about tag)
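To make the difference between the two addressing styles concrete, here is a small Python sketch (not part of cl-fluiddb) that builds the URL for a tag value both ways. The path layouts /objects/<id>/<tag> and /about/<about>/<tag> are my reading of the API and should be treated as assumptions; the function names and the example id are made up for illustration.

```python
from urllib.parse import quote

# Host name as used elsewhere in these posts.
BASE = "http://fluiddb.fluidinfo.com"


def object_tag_url(object_id, tag):
    """URL of a tag value when the object is addressed by its id (assumed layout)."""
    return f"{BASE}/objects/{quote(object_id, safe='')}/{quote(tag)}"


def about_tag_url(about, tag):
    """URL of a tag value when the object is addressed by its about string (assumed layout)."""
    # The about string is a single path segment, so everything in it gets escaped.
    # The tag path keeps its slashes because quote() treats "/" as safe by default.
    return f"{BASE}/about/{quote(about, safe='')}/{quote(tag)}"


if __name__ == "__main__":
    # Hypothetical id, only to show the shape of the URL.
    print(object_tag_url("00000000-0000-0000-0000-000000000000", "hdurer/general/rating"))
    print(about_tag_url("Object for the user named hdurer", "hdurer/people/latitude"))
```

With the /about form there is no lookup step to find the id first; the about string goes straight into the URL.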

Interface changes for better consistency
While making all these additions it also occurred to me that the cl-fluiddb API was inconsistent in its treatment of tag names.  Some functions expected two arguments (a namespace and a tag name) while others expected only one (a string with namespace and tag name).  So in this revamp I made a backward-incompatible change to use only the single-argument way of specifying a tag.  (The exceptions remain create-tag and create-namespace, which put the name of the tag or namespace to be created in the request body, not the URL.)

Format of namespaces and tags
Now that all functions have a unified way of passing namespaces and tags, there is also a generalised way of expressing them.  cl-fluiddb now exposes the new generic function (url-format-namespace-or-tag data), to which you can add a method in order to format your choice of name representation into a string that will be used in the path part of the URL (i.e. 'funny' characters have to be properly URL-escaped in the returned string).  Once a method is defined you can pass that type as a namespace or tag parameter to all functions.  Currently there are methods for strings and lists (of strings).  So now you can say either (get-object-about-tag-value "isbn:xyz-xyz..." "hdurer/general/rating") or (get-object-about-tag-value "isbn:xyz-xyz..." '("hdurer" "general" "rating")).
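For readers who do not speak CLOS, the same idea can be sketched in Python with functools.singledispatch. This is only an illustration of dispatching on the type of the name representation; it is not how cl-fluiddb is implemented, and the escaping rules shown are an assumption.

```python
from functools import singledispatch
from urllib.parse import quote


@singledispatch
def url_format_namespace_or_tag(data):
    """Turn a namespace or tag representation into a URL path fragment."""
    raise TypeError(f"no formatter for {type(data).__name__}")


@url_format_namespace_or_tag.register(str)
def _format_string(data):
    # A string already contains the "/" separators; escape everything else.
    return quote(data, safe="/")


@url_format_namespace_or_tag.register(list)
@url_format_namespace_or_tag.register(tuple)
def _format_sequence(data):
    # Escape each path component on its own, then join the components.
    return "/".join(quote(part, safe="") for part in data)


if __name__ == "__main__":
    print(url_format_namespace_or_tag("hdurer/general/rating"))
    print(url_format_namespace_or_tag(["hdurer", "general", "rating"]))
```

Supporting another representation then only means registering one more function; callers never need to know which representation they were handed.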

All these changes are now in the cl-fluiddb github repository.


Web crawler stupidity

From my access logs:

65.55.3.134 - - [15/Oct/2010:09:49:13 +0000] "GET /robots.txt HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:14 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:16 +0000] "GET /robots.txt HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:17 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:20 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:22 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:25 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:28 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:31 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [15/Oct/2010:09:49:33 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"

65.55.3.134 - - [20/Oct/2010:17:04:13 +0000] "GET /robots.txt HTTP/1.1" 200 76 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:04:13 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:15 +0000] "GET /robots.txt HTTP/1.1" 200 76 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:17 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:19 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:22 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:28 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:30 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:33 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [20/Oct/2010:17:05:38 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"

65.55.3.134 - - [25/Oct/2010:19:22:24 +0000] "GET /robots.txt HTTP/1.1" 200 76 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:25 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:27 +0000] "GET /robots.txt HTTP/1.1" 200 76 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:28 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:31 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:34 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:37 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:39 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:42 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"
65.55.3.134 - - [25/Oct/2010:19:22:44 +0000] "GET / HTTP/1.1" 404 270 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"

I don't know why they think they need to try eight times within about a quarter of a minute to fetch a page that they are told every time doesn't exist.
Plus, what do they think will have changed in the robots.txt in the two seconds since they last checked?
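The counting is easy to check with a few lines of Python. This is just a quick sketch for the log format shown above; the regular expression and the helper name are mine, and the sample is the first burst from the log.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the start of an access-log line as shown above.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

AGENT = '"-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)._"'
SAMPLE = "\n".join(
    f'65.55.3.134 - - [15/Oct/2010:09:49:{sec} +0000] "GET {path} HTTP/1.1" 404 270 {AGENT}'
    for sec, path in [
        ("13", "/robots.txt"), ("14", "/"), ("16", "/robots.txt"), ("17", "/"),
        ("20", "/"), ("22", "/"), ("25", "/"), ("28", "/"), ("31", "/"), ("33", "/"),
    ]
)


def summarise(log_text):
    """Count hits per (path, status) and return the time span of the burst in seconds."""
    hits = Counter()
    stamps = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip blank or malformed lines
        hits[(match["path"], int(match["status"]))] += 1
        stamps.append(datetime.strptime(match["when"], "%d/%b/%Y:%H:%M:%S %z"))
    span = (max(stamps) - min(stamps)).total_seconds() if stamps else 0.0
    return hits, span


if __name__ == "__main__":
    hits, span = summarise(SAMPLE)
    for (path, status), count in sorted(hits.items()):
        print(f"{path} -> {status}: {count} requests")
    print(f"burst lasted {span:.0f} seconds")
```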

What a waste of their and my resources...


Doing your own yubnub clone

I have been using yubnub as my main search engine for ages now but have always been slightly unhappy that a mistyped command goes off to some weird site and that some commands are so inconveniently long (ddgo for duckduckgo.com is the worst offender).

So this week I finally took some commuting time to implement my personal version. It's really not very difficult or amazing as all I want is a simple redirect to different search URLs: get the request, parse out the first token and dispatch based on that.
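The core of such a dispatcher fits in a handful of lines. Below is a minimal Python sketch of the idea, not my actual implementation; the command names, the choice of engines and the fallback to a default engine for unknown commands are all assumptions made for the example.

```python
from urllib.parse import quote_plus

# Command -> search URL template; the commands are made up for this sketch.
ENGINES = {
    "g": "https://www.google.com/search?q={}",
    "ddg": "https://duckduckgo.com/?q={}",
    "wp": "https://en.wikipedia.org/wiki/Special:Search?search={}",
}
DEFAULT_ENGINE = "ddg"


def redirect_url(request):
    """Return the URL to redirect to for a request like 'wp emacs lisp'."""
    words = request.split(None, 1)  # first token plus the rest of the line
    if words and words[0] in ENGINES:
        template = ENGINES[words[0]]
        terms = words[1] if len(words) > 1 else ""
    else:
        # Unknown (or mistyped) command: search for the whole input instead
        # of sending it off to some weird site.
        template, terms = ENGINES[DEFAULT_ENGINE], request
    return template.format(quote_plus(terms.strip()))


if __name__ == "__main__":
    print(redirect_url("wp emacs lisp"))
    print(redirect_url("gg mistyped command"))
```

The web-facing part then only has to answer each request with an HTTP redirect to the returned URL.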

What took me almost the longest time was getting the search engine nicely installed. Apparently my Google-fu is quite weak, as I couldn't easily find complete documentation on how to format the OpenSearch description and how to install it. Actually, it would have been faster if I had had my speakers on and noticed the warning sound from NoScript telling me that JavaScript was disabled on my page, which caused weird behaviour (asking where to save the search description rather than asking whether I wanted to install that search).  Note to self: Trust your own web site (at least in NoScript).


Password policies and sign-up interfaces

Today I came across another stupid site with a confusing password interface/policy.

When signing up for an access token for a "Boris Bike" you have to provide a password for your new account (OpenID is a good idea but unheard of outside tech-savvy sites).  The web site helpfully gives you a hint: "Minimum of 8 characters. Your password should contain upper and lower case letters and at least one number."  No mention of special characters or of enforcement, but hey, it's just a token, eh?  Of course, when you ignore their suggestion you get an error stating that you do in fact have to use upper and lower case characters and digits.  Duh.  Web designers and programmers not agreeing on policy?

Of course that's still miles better than Lloyds-TSB's "Visa secured" (or whatever it's called) system.  For starters it doesn't allow anything other than letters and digits in passwords, and it doesn't tell you that or any other requirement up front; it waits for you to try so it can spring an error message on you telling you what you have done wrong.  For the purpose of this post we'll even forget that all you had to know to sign up was some hardly private information about me, or that these security screens from my bank on vendors' web sites are just the best training to teach people to be phished...

Rant over.


MongoUK

Last week I had the good fortune to attend the one-day MongoUK MongoDB conference.

There were many good talks.  I had hoped for a bit less introduction and more in-depth material, but there was still so much interesting stuff that it was well worth spending the day and the US$50.  I especially liked the in-depth talk about how sharding is implemented (not that we'll need it for our use, but it was intellectually stimulating).  The whole thing somewhat reinvigorated my interest in MongoDB, at least in so far as I spent some time getting it running on my VPS and our DB servers at work (it turns out both are old enough to require the statically linked legacy binaries; I could swear I tried that last time without any luck, but at least this time it worked).

Now to find some time to actually do work with it...


Holding on to memory

This week I spent far too much of my spare time trying to track down where on earth I was "leaking" memory in my trivial Clojure code.

I was trying to load the geoplanet data into MongoDB, doing all of that in Clojure, of course, in my quest to learn the language.

Parsing was fairly easy — a neat sequence of reading the file line by line, splitting it up, cleaning up the columns, parsing integers, etc. Thanks to lazy sequences and the ->> macro this all looks almost trivial:

(defn load-data-file
  "General (for any geoplanet data file) loading file function"
  ([file-name]
     ;; load without any mappings
     (load-data-file file-name []))
  ([file-name mappings]
  (binding [duck-streams/*default-encoding* "UTF-8"]
    (->> file-name
         duck-streams/read-lines
         (map (fn [line] (str-utils2/split line #"\t")))
         (map (fn [line] (map maybe-strip line)))
         (parse-data-file file-name)
         (map (fn [dict] (reduce (fn [dict mapper] (mapper dict)) dict mappings)))))))

There are some extra functions wrapping these to read the three types of files (places, adjacencies, and aliases) but that is irrelevant for the discussion here.

Writing data is also trivial thanks to congomongo. We partition the data we read into larger batches to push into the DB and also give some visual feedback as this will take ages (ignore the munge-place which only adds some extra fields I want):

(defn store-places
  [places-list collection-name]
  (doseq [some-places (partition 1000 1000 [] (map munge-place places-list))]
    (congo/mass-insert! collection-name some-places)
    (print ".")))

However, in combination, (store-places (load-places-file "7.4.1/uk_places.tsv" false) "uk") soon blew up with an exhausted-heap exception, whereas the equivalent

(doseq [x (partition 1000 1000 [] 
                     (map munge-place-to-dict (load-places-file "7.4.1/uk_places.tsv" false)))]
  (congo/mass-insert! "uk" x)
  (print "."))

does work.

I guess the output of load-places-file is a lazy sequence and the store-places function holds onto the head of that sequence (it is the places-list parameter), thus not allowing it to be garbage collected while we iterate over it. It took me ages to find this out. The first urge to factor your code into nice small bits is not always the best way to go.
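The effect is not specific to Clojure. As a rough analogue only (Python has no lazy seqs, so this is an illustration of "holding on to the head", not of the Clojure internals), itertools.tee behaves similarly: as long as one of the two iterators stays parked at the start, every row the other one has already consumed must be kept in memory. The class and function names below are made up for the sketch, and the prompt freeing of rows relies on CPython's reference counting.

```python
import itertools
import weakref


class Place:
    """Stand-in for one parsed row; a class so that it can be weakly referenced."""


# Every Place that has not been garbage collected yet.
live = weakref.WeakSet()


def read_places(n):
    """Lazily produce n rows, like reading a big file line by line."""
    for _ in range(n):
        place = Place()
        live.add(place)
        yield place


def peak_live_rows(n, keep_head):
    """Walk over n rows and report the largest number of rows alive at once."""
    head, walker = itertools.tee(read_places(n))
    if not keep_head:
        del head  # nobody points at the start any more
    peak = 0
    for _row in walker:
        peak = max(peak, len(live))
    return peak


if __name__ == "__main__":
    print("head kept:    ", peak_live_rows(100_000, keep_head=True))
    print("head released:", peak_live_rows(100_000, keep_head=False))
```

With the head kept, all rows stay alive until the walk is finished; once the head is released, only a small window of rows is alive at any time.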

 


Use YQL to do multiple requests in one go

I have known for quite a while that YQL is the best thing since sliced bread, but today I learned another reason why it is so great.  I was looking at Christian Heilmann's latest project, the geoplanet explorer (source code on github), and noticed while reading the code that YQL can do multiple requests in one go. Just select from query.multi, giving the semicolon-separated list of actual queries in the queries selector.

This should speed up multiple requests (saving you from having to establish multiple connections), especially if the services queried are close to Yahoo!, such as Yahoo!'s own services.
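Building such a combined statement is simple string work. Here is a small Python sketch of how I understand it; the helper name and the example sub-queries are made up, and the exact quoting rules YQL expects are an assumption (the sketch simply refuses sub-queries that contain double quotes).

```python
def multi_query(queries):
    """Combine several YQL statements into one query.multi statement."""
    cleaned = [q.strip().rstrip(";") for q in queries]
    if any('"' in q for q in cleaned):
        # Keep the sketch simple: the combined list is wrapped in double quotes.
        raise ValueError("sub-queries must not contain double quotes in this sketch")
    return 'select * from query.multi where queries="{}"'.format(";".join(cleaned))


if __name__ == "__main__":
    print(multi_query([
        "select * from geo.places where text='london'",
        "select * from geo.places where text='paris';",
    ]))
```

The resulting statement is sent to YQL like any other single query, so only one connection is needed.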


dev8d developer days - initial summary

So Friday was my third and last day at this year's London dev8d (http://dev8d.org/).  

Executive summary: It was great!!!

There was so much to see and do and learn that I am still buzzing with excitement and at the same time I am sad when I consider all the things I did not get to see, projects not finished, people not talked to.

The format of the event was somewhat unconventional to me.  As with conferences there were many tracks, but most of them were not presentations but activities that involve you: coding dojos, workshops, projects.  In addition there were ten "challenges" that asked the participants to produce something by Saturday with the chance to win prizes.  Only one track was "expert talks"; it contained lots of interesting things but got a bit lost among the many other activities, so I missed all of its talks except the RepRap one.

Maybe I spread myself too thin by doing too many different activities; I came nowhere near finishing any project (an interesting modification of the prepared genetic programming code; finding some spurious correlation in data downloaded from data.gov.uk; learning enough about Scala to not be a complete fool; ...). On the other hand I got a lot of interesting leads that I will have to follow up on in the next few weeks.  The new knowledge of SPARQL and of how to find data in data.gov.uk will definitely come in handy for work.

I can only hope that there will be many more events like this in the future that I will be able to attend.  A big thanks to all sponsors of dev8d, the organisers who did a fab job and the presenters who invested a lot of time preparing things for us.
