Tutorial: Generating and Visualizing Topic Models with Tethne and MALLET

Tethne provides a variety of methods for working with text corpora and the output of modeling tools like MALLET. This tutorial focuses on parsing, modeling, and visualizing a Latent Dirichlet Allocation topic model, using data from the JSTOR Data-for-Research portal.

In this tutorial, we will use Tethne to prepare a JSTOR DfR corpus for topic modeling in MALLET, and then use the results to generate a semantic network like the one shown above.

In this visualization, words are connected if they are associated with the same topic; the heavier the edge, the more strongly those words are associated with that topic. Each topic is represented by a different color. The size of each word indicates the structural importance (betweenness centrality) of that word in the semantic network.
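The construction described above can be sketched in plain Python. This is a minimal illustration, not Tethne's or MALLET's actual code: the `topics` dictionary and its weights are made up, and taking the product of two word weights as the edge weight is an assumption chosen for simplicity. Betweenness centrality is computed with Brandes' algorithm on the unweighted graph; in practice a library routine such as `networkx.betweenness_centrality` would serve the same purpose.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical topic -> {word: weight} data, standing in for parsed MALLET output.
topics = {
    0: {"gene": 0.30, "selection": 0.25, "population": 0.20},
    1: {"population": 0.35, "species": 0.30, "habitat": 0.15},
}


def build_semantic_network(topics):
    """Link every pair of words that share a topic.

    Edge weight is the product of the two word weights (an assumption);
    if a pair co-occurs in several topics, the strongest association wins.
    """
    edges = {}
    for topic_id, words in topics.items():
        for a, b in combinations(sorted(words), 2):
            weight = words[a] * words[b]
            if (a, b) not in edges or weight > edges[(a, b)]["weight"]:
                edges[(a, b)] = {"weight": weight, "topic": topic_id}
    return edges


def betweenness(edges):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    score = dict.fromkeys(adjacency, 0.0)
    for source in adjacency:
        # Breadth-first search, counting shortest paths from the source.
        order = []
        predecessors = {v: [] for v in adjacency}
        paths = dict.fromkeys(adjacency, 0)
        paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)

        # Accumulate dependencies, farthest nodes first.
        dependency = dict.fromkeys(adjacency, 0.0)
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]

    # Each undirected pair of endpoints was counted from both ends.
    return {v: c / 2 for v, c in score.items()}


edges = build_semantic_network(topics)
centrality = betweenness(edges)
```

In this toy data, "population" is the only word shared by both topics, so it is the only word that lies on shortest paths between the two clusters and would be drawn largest.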

This tutorial was developed for the course Introduction to Digital & Computational Methods in the Humanities (HPS), created and taught by Julia Damerow and Erick Peirson.

Tethne: Geographic Networks in Gephi and Google Earth

Many bibliographic datasets include institutional affiliations for authors. Using geocoding services, such as the Google Geocoding API, we can convert institution names and addresses into geographic coordinates that can be plotted on a map. Tethne provides geocoding services in the services.geocode module.

In this tutorial, we will use the Google Geocoding service to obtain geographic coordinates for authors in a coauthorship network (see Coauthorship Networks) and its derivative, the institutions network (see networks.authors.institutions()). We will then plot those geo-coded networks in Gephi using the Geo Layout plugin, and overlay them on a 3D map of the globe in Google Earth.
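As a rough sketch of the data-preparation step, the snippet below attaches coordinates to authors and writes a node table that could be imported into Gephi. It does not call the Google Geocoding API or Tethne's services.geocode module: the `COORDINATES` lookup is a stand-in with approximate, illustrative values, the author names are placeholders, and the column names (`latitude`, `longitude`) are assumptions about how the table would be laid out.

```python
import csv
import io

# Stand-in for a geocoding service: institution -> (latitude, longitude).
# Values are approximate and for illustration only.
COORDINATES = {
    "Arizona State University": (33.42, -111.93),
    "Marine Biological Laboratory": (41.53, -70.67),
}

# Hypothetical author -> institution records from a bibliographic dataset.
authors = {
    "Author A": "Arizona State University",
    "Author B": "Marine Biological Laboratory",
    "Author C": "Unknown Institute",
}

FIELDS = ["Id", "Label", "institution", "latitude", "longitude"]


def node_table(authors, coordinates):
    """Return one row per author whose institution could be located."""
    rows = []
    for name, institution in sorted(authors.items()):
        located = coordinates.get(institution)
        if located is None:
            # Skip authors that the lookup cannot place on the map.
            continue
        latitude, longitude = located
        rows.append({
            "Id": name,
            "Label": name,
            "institution": institution,
            "latitude": latitude,
            "longitude": longitude,
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = node_table(authors, COORDINATES)
table = to_csv(rows)
```

Authors whose institution cannot be located are dropped here; a real workflow might instead log them for manual correction.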

This tutorial was developed for the course Introduction to Digital & Computational Methods in the Humanities (HPS), created and taught by Julia Damerow and Erick Peirson.

Click here for the full tutorial.

Python Class for Accessing Custom DSpace API

By Erick Peirson

The Genecology Project, and other computational HPS projects at ASU, rely on the ASU Digital HPS Community Repository for storing a variety of data and products. An important part of the development process for the repository was producing a REST API that could provide reliable and programmatic access to deposited materials. Since the built-in DSpace REST API didn't quite have all of the functions that we needed (e.g. it didn't have a reliable authentication mechanism, and it ignored access restrictions), we collaborated with programmers at the Marine Biological Laboratory to develop a new API (available here). Since the analysis process in most of our computational projects takes place in a Python environment, we had to develop some simple methods for pulling material from the repository via the API.

A Python class for interacting with our custom DSpace API is available here, and documentation is available here. This is a work in progress, but may be useful for others who adopt similar infrastructure for their digital projects, or who wish to interact with the ASU Digital HPS Community Repository.

History of the Max Planck Society — Department Baldwin - Introgression in Co-authorship Networks

By Erick Peirson

An important component of analyzing the causes and dynamics of conceptual change in science is understanding the behavior and influence of individual scientists, in the context of their collaborations and discursive activity. Fleck's concept of Denkkollektiv drew attention to the ways in which patterns of collaboration give rise to specialized Denkstil -- patterns of thought, language, and practice that constitute the lens through which scientists see the natural world and ask questions about it. Consistent with our everyday experience in social situations, Social Network Analysis has shown how power and influence are distributed unevenly among individual actors in collectives, shaping the flow of ideas and information in those social networks. Graph theory gives us a rich collection of concepts and metrics to express such influence quantitatively, based on the structural properties of networks. As historians we are interested not only in the structure of particular social networks, but also in how those networks evolve. With respect to analyzing the behavior and influence of individual actors, this prompts us to ask how different scientists enter existing collaborative networks, and how their structural positions within those networks change over time.

One way to pursue this question... (read more)

New Course in Digital & Computational Methods

Advances in computational methods have created exciting new opportunities to apply digital methods not only to historical and philosophical research, but also to discovery and meta-analysis in the life sciences. This hands-on course provides an introduction to digital methods — from managing digital data to computational analysis — for advanced undergraduate and graduate students in the humanities and life sciences. Students are introduced to data and metadata management, text-mining, citation analysis, network analysis, data visualization, and other computational methods. The course balances fundamental theory with hands-on sessions geared toward students’ own research projects. This course is suitable for students in both the sciences and the humanities, and is cross-listed for Biology, History & Philosophy of Science, and English.

Digital + Computational Methods in the Humanities (HPS)
Spring, 2014: BIO/HPS/ENG 498/591
Tuesday/Thursday 5:30pm - 7:00pm

For more information, contact Julia Damerow or Erick Peirson.

The Data Mining and Analytics Team

The Data Mining and Analytics Team focuses on extracting the dynamics of complex systems from real-world data.

We build unique and innovative data systems that capture in unprecedented detail the processes that lead to important scientific innovation. Combining expertise in data wrangling, network science, and advanced statistical modeling, we push at the interdisciplinary boundaries of the life sciences, medicine, clinical research, data science, and digital humanities.


New Publication in Biomedical Research

On March 30th, 2017, Manfred Laubichler, Julia Damerow and Erick Peirson published a paper titled The diversity of experimental organisms in biomedical research may be influenced by biomedical funding.

Contrary to concerns of some critics, we present evidence that biomedical research is not dominated by a small handful of model organisms. An exhaustive analysis of research literature suggests that the diversity of experimental organisms in biomedical research has increased substantially since 1975. . .


Software Development & Trans-disciplinary Training

Julia Damerow, Manfred Laubichler, and Erick Peirson collaborated on a paper titled Software development & trans-disciplinary training at the interface of digital humanities and computer science.

The computational turn in the humanities has precipitated the need for sustainable software development projects that are specifically focused on humanities research problems, and the need for graduate and undergraduate training models that address the trans-disciplinary nature of computational humanities research.

Berlin 11 Open Access Conference

On November 20, 2013, Manfred Laubichler gave a presentation at the Berlin 11 Open Access Conference, titled "Transforming Research and Education in the 21st Century: The Role of Open Access".

Making its scientists’ research findings available for the benefit of the whole of humanity, free of charge whenever possible (Open Access), is a key aspiration of the Max Planck Society. Out of this spirit, the “Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities” was initiated by the Max Planck Society in October 2003.

Austria’s Ahead-of-Its-Time Institute That Was Lost to Nazis

Manfred Laubichler was interviewed for an article by Chelsea Wald about the Biologische Versuchsanstalt in Vienna, published in the recent issue of Nautilus. From the article:

In 1911, Popular Science Monthly published an enthusiastic description of a young, private experimental-biology institute in Vienna, lauding its “remarkable scientific productivity resulting from only eight years of research.”

An early air-conditioning system used to control air temperatures at the Vivarium

The author, zoologist Charles Lincoln Edwards, attributed the success of the Biologische Versuchsanstalt (Institute of Experimental Biology) to its many advanced experimental devices. The institute, popularly known as the Vivarium, boasted a wide range of terrariums, which housed hundreds of organisms, from glow-worms to kangaroos, at strictly controlled temperatures, humidity, pressure, and light levels. That wasn’t always easy—the Vivarium had to adopt or invent many cutting-edge technologies, including an early air-conditioning system. It was “a pioneer in the use of the carbonic-acid cooling machine for maintaining a cold environment,” wrote Edwards. With the help of circulating salt water and a condenser, four rooms were kept at temperatures ranging from 5°C to 20°C.

The idea of using various apparatuses to control the living conditions of plants and animals for study was new; before that, scientists mainly observed their subjects in nature. At the Vivarium, the focus was on raising many generations under the same conditions in order to probe questions of heredity and development—a unique approach at the time, and one that many consider a precursor to today’s research on evolutionary developmental biology, or “evo-devo.”

New Publication in Studies in History and Philosophy of Life Sciences

Plasticity, stability, and yield: The origins of Anthony David Bradshaw's model of adaptive phenotypic plasticity

A new paper by Erick Peirson in Studies in History and Philosophy of Life Sciences is available online.

Plant ecologist Anthony David Bradshaw's account of the evolution of adaptive phenotypic plasticity remains central to contemporary research aimed at understanding how organisms persist in heterogeneous environments. Bradshaw suggested that changes in particular traits in response to specific environmental factors could be under direct genetic control, and that natural selection could therefore act directly to shape those responses: plasticity was not “noise” obscuring a genetic signal, but could be specific and refined just as any other adaptive phenotypic trait. In this paper, I document the contexts and development of Bradshaw's investigation of phenotypic plasticity in plants, including a series of unreported experiments in the late 1950s and early 1960s.

For those without an institutional subscription to Elsevier, see the archived preprint.

New Review in Quarterly Review of Biology

Erick Peirson's review of Ted R. Anderson's The Life of David Lack: Father of Evolutionary Ecology was published in the March issue of the Quarterly Review of Biology. See the full review here (paywall).

David Lack (1910–1973) was a British ornithologist whose research on population biology was part of a broader set of attempts in the mid-20th century to integrate neo-Darwinian evolutionary theory into explanations of the distribution and abundance of species. In The Life of David Lack, ecologist Ted R. Anderson asserts that Lack should be appreciated as the “father of evolutionary ecology,” pointing to his 1947 book Darwin's Finches and his 1947 paper titled “The Significance of Clutch-Size” as evidence for his claim. Central to both of those works was the idea that the demographic and reproductive characteristics of a species are best explained in terms of maximizing the reproductive fitness of individual organisms, and the idea that the mechanisms of this selective process can be studied through experimental manipulations in the field.

New article in Biology & Philosophy

PhD candidate Christopher Dimond is a co-author on a recent paper in Biology & Philosophy, titled "Pluralism in evolutionary controversies: styles and averaging strategies in hierarchical selection theories."

Two controversies exist regarding the appropriate characterization of hierarchical and adaptive evolution in natural populations. In biology, there is the Wright–Fisher controversy over the relative roles of random genetic drift, natural selection, population structure, and interdemic selection in adaptive evolution begun by Sewall Wright and Ronald Aylmer Fisher. There is also the Units of Selection debate, spanning both the biological and the philosophical literature and including the impassioned group-selection debate. Why do these two discourses exist separately, and interact relatively little? We postulate that the reason for this schism can be found in the differing focus of each controversy, a deep difference itself determined by distinct general styles of scientific research guiding each discourse. That is, the Wright–Fisher debate focuses on adaptive process, and tends to be instructed by the mathematical modeling style, while the focus of the Units of Selection controversy is adaptive product, and is typically guided by the function style. The differences between the two discourses can be usefully tracked by examining their interpretations of two contested strategies for theorizing hierarchical selection: horizontal and vertical averaging.

New Publication in Creative Education: A Global Classroom for International Sustainability Education

Manfred Laubichler and Guido Caniglia are co-authors on a recent paper in the journal Creative Education: A Global Classroom for International Sustainability Education.

A brief review of international sustainability education options currently available to students reveals a gap between the knowledge students may need to succeed in a globalized world and the opportunities available. Into this landscape, we introduce The Global Classroom, an international collaboration between Leuphana University of Lüneburg in Germany and Arizona State University in the US. The project strives for an interdisciplinary and cross-cultural approach to equipping students with the knowledge, skills, and attitudes required to take on sustainability challenges in international settings. We discuss the structure and organization of the Global Classroom model and share preliminary experiences. The article concludes with a reflection on institutional structures conducive to providing students with the international learning opportunities they may need to tackle sustainability problems in a globalized world.

New Publication: Genetic and developmental basis of F2 hybrid breakdown in Nasonia parasitoid wasps

Gibson, J. D., O. Niehuis, B. R. E. Peirson, E. I. Cash, J. Gadau. 2013. "Genetic and developmental basis of F2 hybrid breakdown in Nasonia parasitoid wasps." Evolution: Available Online.

Speciation is responsible for the vast diversity of life, and hybrid inviability, by reducing gene flow between populations, is a major contributor to this process. In the parasitoid wasp genus Nasonia, F2 hybrid males of Nasonia vitripennis and Nasonia giraulti experience an increased larval mortality rate relative to the parental species. Previous studies indicated that this increase of mortality is a consequence of incompatibilities between multiple nuclear loci and cytoplasmic factors of the parental species, but could only explain ∼40% of the mortality rate in hybrids with N. giraulti cytoplasm. Here we report a locus on chromosome 5 that can explain the remaining mortality in this cross. We show that hybrid larvae that carry the incompatible allele on chromosome 5 halt growth early in their development and that ∼98% die before they reach adulthood. On the basis of these new findings, we identified a nuclear-encoded OXPHOS gene as a strong candidate for being causally involved in the observed hybrid breakdown, suggesting that the incompatible mitochondrial locus is one of the six mitochondrial-encoded NADH genes. By identifying both genetic and physiological mechanisms that reduce gene flow between species, our results provide valuable and novel insights into the evolutionary dynamics of speciation.

Paper in ISIS: "Computational Perspectives in the History of Science" by Laubichler, Maienschein, and Renn

You can read the full text online (open access!) here.

Abstract. Computational methods and perspectives can transform the history of science by enabling the pursuit of novel types of questions, dramatically expanding the scale of analysis (geographically and temporally), and offering novel forms of publication that greatly enhance access and transparency. This essay presents a brief summary of a computational research system for the history of science, discussing its implications for research, education, and publication practices and its connections to the open-access movement and similar transformations in the natural and social sciences that emphasize big data. It also argues that computational approaches help to reconnect the history of science to individual scientific disciplines.