Query Sampling Journals

One of the other projects I’ve been working on is sampling the corpus of the Journal of the American Chemical Society against the Stanford Encyclopedia of Philosophy (SEP), using some models set up by the INPHO project here at Indiana.  The hope is to see whether, between 1879 and 1922 (the years I can analyze right now), the journal represents the philosophy of chemistry according to what the SEP thinks should be going on.  Here is a quick spreadsheet of the results.  It is organized year by year, and within each year the SEP articles are ordered by how closely they match the query sample (i.e. article 1 is closest, article 2 is next closest, and article 10 is relatively far away).  Overall, not surprisingly, the number one result is Chemistry.  That means, at the least, that the query samples are picking up on the main topic of the journal.  What is more interesting is the articles that follow.  No real patterns stand out, and some results are puzzling: Duhem is the second closest article in 1889 and a bit further down the list in 1891, yet Duhem is mentioned by name 3 times in one article in 1891 and not at all in 1889.  To me, this suggests two things.  First, there is some bias in the SEP toward particular topics.  Second, at least for earlier periods, standard encyclopedias like the SEP may not be the best sources for how the field was progressing.  I’m still thinking about that second argument, so more on that later.
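For readers who want a concrete picture of what "closest" means in a ranking like this, here is a minimal sketch. It is not the INPHO code: it assumes each SEP article and each year's journal sample has already been reduced to a vector of topic weights, and it uses cosine similarity as a stand-in closeness measure. The function names, the article labels, and the numbers are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def rank_articles(query_vector, article_vectors, top_n=10):
    """Return (title, score) pairs, closest article first."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in article_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical topic weights, invented for this example only.
sep_articles = {
    "Chemistry": [0.8, 0.1, 0.1],
    "Duhem": [0.5, 0.4, 0.1],
    "Free Will": [0.0, 0.1, 0.9],
}
journal_year_sample = [0.7, 0.2, 0.1]

ranking = rank_articles(journal_year_sample, sep_articles)
for position, (title, score) in enumerate(ranking, start=1):
    print(position, title, round(score, 3))
```

One thing a sketch like this makes visible is that an article can rank highly without its subject ever being named in the journal. The match is on overall topic weights, not on the appearance of a name, which may be part of why Duhem ranks higher in 1889 than in 1891.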


