Topics of Three American Scientific Journals between 1819 and 1922
After my first analysis of American scientific journals during the nineteenth century, I did some more topic modeling on my dataset of three journals: the American Journal of Science (AJS), from 1819 to 1922; the Proceedings of the American Association for the Advancement of Science (PAAAS), from 1848 to 1915; and the Journal of the American Chemical Society (JACS), from 1879 to 1922. For this analysis, a topic model was run on the entire corpus rather than on just one journal at a time. Each of the 209 text files was split into documents of 1,000 words, and a 500-topic model was created to investigate the corpus as a whole (see quick visualization above).
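The splitting step can be sketched as follows. This is a minimal illustration, not the script actually used for this analysis: the function name `split_into_chunks` is hypothetical, and it assumes that "words" means whitespace-separated tokens and that a short final chunk is kept rather than dropped.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 1000) -> List[str]:
    """Split text into consecutive chunks of at most `chunk_size` words.

    Assumption: words are whitespace-separated tokens, and the last chunk
    may be shorter than `chunk_size`.
    """
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]
```

Each resulting chunk would then be treated as one document when the topic model is trained, so that a single long volume does not count as one enormous document.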
Unsurprisingly, chemistry accounts for 32% of the topics, most likely because JACS is such a large part of the dataset. Similarly, geology and mineralogy form the next largest group at 14%, reflecting much of the subject matter of AJS. Some subject areas, such as zoology, botany, and meteorology, are also more prominent in this topic model than in the other models. There are two somewhat surprising results, however: the seemingly large number of topics relating to theory and method, and the number relating to business. Both of these categories might help to indicate trends in professionalization. Given the trends from the previous analyses, it is important to ask in which journals these theoretical and business topics appear, and during what time period. Presumably, if the previous analyses were correct, the answers would be that theoretical topics are more prominent in PAAAS and that both kinds of topics increase during the late 1870s.
A typical theory-and-method topic contains the words: term, terms, general, called, case, true, present, sense, relations, defined, definition, relation, form, definite, considered, question, paper, applied, expressed, and conditions. This is a topic made up of general words about science and how it is performed. The document with the largest number of words assigned to this topic (45) comes from PAAAS in 1876. A similar topic has the words: factor, influence, conditions, factors, important, direct, effect, relative, relation, determining, control, role, present, combined, importance, influences, indirect, data, favorable, and normal. The documents most strongly related to this topic also come from PAAAS, in the 1890s. Thus it seems that, as with the previous topic model analysis, PAAAS discusses more theoretical topics, and these topics cluster in the later part of the nineteenth century.
With regard to business topics, especially as they pertain to the ACS, one topic contains the words society, chemical, american, journal, chemistry, chemists, members, editor, year, abstracts, papers, industrial, meeting, council, address, york, number, publication, proceedings, and directors. This topic covers meetings of the ACS and also addresses its publication concerns. It appears only in the JACS documents, and the earliest date at which it appears is 1893, around the same time as the divide between “unexpected” and “expected” topics in the previous analysis of JACS. A topic with the words committee, report, secretary, congress, appointed, international, members, council, meeting, year, association, account, library, fund, chairman, treasurer, adopted, committees, received, and society is a more general topic that also refers to meetings, and it appears most frequently in PAAAS. The earliest PAAAS issue for this topic is from 1874, and most of the documents assigned to it come from the 1890s, which would again be consistent with previous analyses.
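The two questions asked of each topic here (which document has the most words assigned to it, and when does it first appear) can be sketched as below. This is only an illustration under stated assumptions: the `Chunk` record, the function names, and the idea that per-document topic word counts are already available in this shape are all hypothetical, not the format Mallet itself produces.

```python
from typing import Dict, List, NamedTuple, Optional


class Chunk(NamedTuple):
    """One 1,000-word document (hypothetical record layout)."""
    journal: str
    year: int
    topic_word_counts: Dict[int, int]  # topic id -> words assigned to it


def peak_chunk(chunks: List[Chunk], topic: int) -> Optional[Chunk]:
    """Return the chunk with the most words assigned to `topic`, if any."""
    hits = [c for c in chunks if c.topic_word_counts.get(topic, 0) > 0]
    if not hits:
        return None
    return max(hits, key=lambda c: c.topic_word_counts[topic])


def earliest_year(chunks: List[Chunk], topic: int,
                  journal: Optional[str] = None) -> Optional[int]:
    """Return the earliest year in which `topic` appears, optionally
    restricted to a single journal."""
    years = [
        c.year for c in chunks
        if c.topic_word_counts.get(topic, 0) > 0
        and (journal is None or c.journal == journal)
    ]
    return min(years) if years else None
```

Restricting `earliest_year` to one journal corresponds to statements such as "the earliest PAAAS issue for this topic", while leaving `journal` unset answers the question for the corpus as a whole.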
It is also worth noting that 85 topics were discarded for this analysis. Most of these appear to be random distributions of numbers (such as a topic consisting of 000, 1, 500, 100, 400, 200, 10, 300, 600, 800, 700, total, 20, 250, 50, 3, 30, 900, 150, and 750). These seem to represent page numbers or indexes that are sometimes present at the back of volumes. Considering that the overall topic model contained 500 topics, the discarded topics make up a relatively small share (17%).
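A simple way to flag such number-heavy topics is to measure how many of a topic's top words are purely numeric. The sketch below is an assumption about how this could be automated, not a description of how the 85 topics were actually identified; the function names and the 0.8 threshold are illustrative choices.

```python
from typing import List


def numeric_share(top_words: List[str]) -> float:
    """Fraction of a topic's top words that consist only of digits."""
    if not top_words:
        return 0.0
    return sum(1 for w in top_words if w.isdigit()) / len(top_words)


def is_numeric_topic(top_words: List[str], threshold: float = 0.8) -> bool:
    """Flag a topic as number-dominated (the threshold is an assumption)."""
    return numeric_share(top_words) >= threshold


# The example topic quoted above: 19 of its 20 top words are numbers.
page_number_topic = (
    "000 1 500 100 400 200 10 300 600 800 700 total "
    "20 250 50 3 30 900 150 750"
).split()

# Share of the 500-topic model that was discarded.
discarded_share = 85 / 500
```

For the example topic, 19 of the 20 top words are numeric, so it would be flagged under this threshold. Topics flagged this way would still need a manual check, since a topic about measurements or tables could also be number-heavy.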
Overall, the findings of this second analysis seem consistent with those of the previous analyses. PAAAS does seem to be more involved in discussing theoretical topics, and business topics appear to be discussed more frequently in the later part of the nineteenth century. Most importantly, JACS becomes more involved in the discussion of business in the 1890s, the same period in which the ACS is consolidating into a professional society with a clear identity separate from the AAAS. This analysis does, however, add some nuance to the overall picture of science. Why do subjects such as astronomy and meteorology, which were often so small in the other analyses that they were lumped together as “other sciences,” appear relatively prominently here? Additionally, since physics was the next discipline to form a professional society, when did it become so prominent? The other topic models seem to show physics as equal to chemistry. Some of the answers to these questions may be matters of statistical significance, since this analysis over the entire corpus is more consistent with the assumptions of Mallet. Nonetheless, both analyses point to the same general conclusions found in past topic modeling research on this dataset.