The Overlooked Genre of Calls-for-Papers: New Grounds to Study Shifts and Connections in Academic Discourse

Open Access
|Mar 2025

(1) Context and motivation

Every year, academics travel around the world to meet at different conferences, panels, and conventions. They gather to present findings and discuss new ideas. These gatherings are key to the functioning of academia, as they mobilize the network of production, distribution, and storage of knowledge. It is, after all, through conferences, colloquia, and journals that universities combat isolation and have their research tested and challenged by other members of the community. A silent, often invisible actor in this world is the call-for-papers (CfP). These short texts, written with the intention of “getting the word out there” and connecting with academics all over the world, constitute an academic form through which scholars kickstart ideas and argue for the relevance of a particular topic. A CfP is a short, promotional text that looks for new, original academic papers to be presented at a convention, conference, or panel, or to be published in an edited volume or special issue of a journal. CfPs may very well be responsible for activating the whole process of research production and innovation, while serving as a bridge between scholars who are working on similar projects. The influence of CfPs, as a genre, on the outputs of academic production is yet to be studied. This paper, which details the collection of a large number of CfPs in humanities fields, aims to open avenues of investigation by looking at how CfPs may shape the formation of particular discourses in academic circles, as well as how they may foreshadow future turns in academic discourse.

This study joins other work in the humanities that has focused on administrative genres, such as Laura McGrath’s research on book deal announcements (BDAs). In “‘Books About Race’: Commercial Publishing and Racial Formation in the 21st Century” (2023), McGrath analyzes hundreds of BDAs (short announcements of deals between publishers and writers) to study the discourse around race that circulates within their production. BDAs function as promises of future books that have in most cases not even been written. McGrath sees value in studying these documents as they provide insight into “the assumptions that govern the processes of book acquisition and promotion” (p. 773). Furthermore, she examines the category of “books about race” in order to understand how it works as a “market category” and how “the publishing industry contributes to the ongoing project of racial formation in the US” (p. 773). To that end, McGrath uses two corpora (one that includes BDAs containing explicit racial language and one that does not) to understand how race as a category is marketed, defined, and mobilized in the predominantly white industry of big, commercial publishers that have a mostly white audience in mind (p. 790). By studying a specialized, administrative genre, McGrath is able to identify the influence that BDAs have on shaping the public discourse around race (p. 790). In this sense, McGrath’s article provides a blueprint of the affordances of studying parallel genres like CfPs. CfPs also function as promises of future discussions and provide insight into how knowledge circulates in the academic “market” of papers, articles, and conferences. As McGrath does with BDAs, one could also study how CfPs shape particular discourses within academic networks (like “conferences about race” or “panels about critical theory”).

As McGrath also argues, studying BDAs is relevant regardless of whether the books differ or not from their initial announcement, because the BDA itself shows “which expressions or forms of creation are deemed profitable and, subsequently, granted permission to shape the field” (p. 777). A similar logic, while less subject to commercial interests, can be applied to CfPs. For example, future researchers may use this dataset to ask and answer questions such as: Are we talking about “theory” in a different way than 20 years ago? Are we more interdisciplinary than in the past? What kind of questions have been fundamental for the field of critical race studies?

As many scholars have pointed out, data cleaning, or “data carpentry” as Karsdorp et al. (2021) call it, is not a neutral or objective process. In what follows, after describing the dataset, its method of collection, and its limitations, I conduct an initial exploration of the dataset, explaining my curatorial decisions. Finally, I conclude by discussing the potential implications this dataset might have for future research.

(2) Dataset description

The dataset is a collection of “calls-for-papers” housed by the University of Pennsylvania since 1995. It consists of 86,290 postings that were advertised by hundreds of universities and organizations from June 28, 1995, to April 25, 2024. The University of Pennsylvania created this site to gather CfPs in the fields of English, American Studies, and the humanities in general. The UPenn Call-for-Papers Website (https://call-for-papers.sas.upenn.edu/) was launched in 1995 and has since become one of the most popular places to post an advertisement for future conferences in the humanities. It is difficult to know with certainty the total number of conferences, panels, and book collections that are advertised every year in the humanities, as this information is not centralized anywhere. The UPenn CfP website, however, is remarkable because it is the only one of its size focusing exclusively on the humanities, and because it has archived every post made from the beginning. While other popular online CfP databases exist (e.g. The CfP List, Call4 Paper), they purge their databases of outdated calls, so that they host only a couple hundred CfPs at any given time.

Note: Full dataset and metadata description, data provenance and decision logs can be found on the repository (see README files).

Repository location

https://doi.org/10.7910/DVN/DFSMBN

Repository name

Harvard Dataverse

Object name

Calls-for-Papers Dataset.

Format names and versions

CSV, ipynb, txt

Creation dates

Data collection: April, 2024. Cleaning and exploration: April–December, 2024.

Dataset creator

Juan Pablo Albornoz, Cornell University

Language

English

License

CC-BY-NC

Publication date

2024-10-17

(3) Method

Users entering the UPenn CfPs website (https://call-for-papers.sas.upenn.edu/) can browse the CfPs in different ways. The homepage gathers the most recent posts in the form of a list, with 30 posts per page, from the newest to the oldest. These posts include a title, a deadline for submission, and a shortened description. Users can click on specific posts to see the full description of each CfP. Every post is also tagged with one or more of the 41 available categories. These include specific fields (e.g. medieval, modernist, or postcolonial studies), general areas or topics (e.g. interdisciplinary, theory, or cultural studies), and type of submission (e.g. graduate conferences, journal articles, or awards).

For each CfP, I scraped both the shortened version and the full-length description. This resulted in 86,290 scraped posts from June 28, 1995, to April 25, 2024,[1] which were downloaded into a .csv file that organized the information according to categories such as “year”, “content” or “category”.[2] The cutoff date has no special meaning other than being the day the scraping took place. Data cleaning and curating took place sporadically between April 2024 and January 2025. I have made all CfP data, as well as the scripts I used to scrape and explore the data, available in this paper’s data repository, not only for review but also to help anyone interested in downloading posts after the cutoff date. The exploratory method involved creating scripts that counted and graphed relative and absolute frequencies of the category tags of the CfPs. Cleaning decisions included deduplicating entries, adding ID tags, and creating a column for the year published. I relied on the Python packages Matplotlib (Hunter, 2024) and Networkx (Hagberg et al., 2008) to create the graphs in section 4. I also did content analysis with a sample of 100 CfPs, creating content codes that could be explicitly found in the CfPs and that would help me describe them better as a genre.
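As a minimal sketch of the kind of counting script described above (not the repository's actual code), the following reads rows with the “year”, “content”, and “category” columns named in this section and tallies absolute tag counts. The semicolon separator inside the “category” field, the sample rows, and the function name `count_tags` are assumptions made purely for illustration; the real file layout is documented in the repository README.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row extract; the real file's layout may differ.
SAMPLE_CSV = """year,content,category
2010,Call for papers on travel narratives,travel writing;postcolonial
2010,Call for papers on lyric form,poetry
2011,Call for essays on empire,postcolonial;theory
"""

def count_tags(csv_text, sep=";"):
    """Return a Counter of category tags across all rows.

    `sep` is the assumed delimiter between tags in the 'category' field.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        tags = [t.strip() for t in row["category"].split(sep) if t.strip()]
        counts.update(tags)
    return counts

tag_counts = count_tags(SAMPLE_CSV)
```

Counts produced this way could then be passed to a plotting library such as Matplotlib to draw bar or line charts.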

(3.1) Limitations

  • Representation. While this dataset gathers a large amount of information about CfPs in the humanities, not every CfP in the field is promoted on the website. The information in the dataset is limited to what is posted in the UPenn CfP website. Moreover, data curation and cleaning were influenced by the way the website catalogues and presents the CfPs. That is, the characteristics and standard form of a CfP that I describe in the first section are also dictated by the way the site gathers information from users.

  • Geography and Language. The website only accepts CfPs in English, and while there is not an explicit geographical restriction, most CfPs submitted to the site come from the United States or Europe. This limitation, however, is an opportunity for further research. For example, future research could explore comparable datasets from the Global South, or could use this dataset to investigate how the Global North promotes or discusses topics or literature about the Global South.

  • Time. This dataset does not provide any information on CfPs before 1995 or after April 25, 2024. Moreover, any analysis made between 1995 and 2005 may suffer from underrepresentation because of the site’s still incipient nature. As you can see in the next section, yearly posts to the site did not go over a thousand until 2005. Expanding this dataset into the future is simple; it is much more challenging, nearly impossible, to build one of comparable standards before 1995 (or even before 2000).

  • Type. The history of CfPs predates the internet era. This dataset is restricted to online material that is organized by the submission requirements of the UPenn CfP Website. It does not consider any kind of printed CfPs, or other online material. The characteristics of CfPs reached through the exploration of this dataset are limited to the particular online form that gets published on the website.

(4) Results and discussion

(4.1) 100 CfPs sample

In order to better describe this administrative genre, I closely read 100 CfPs from the dataset. I randomly chose 30 from the early years (1995–2005), 30 from the middle ones (2006–2015), and 40 from the recent years (2014–2024). I took notes on their structure by creating binary codes that described the presence or absence of formal and content features of the CfPs (for example, “Does the CfP justify the relevance of its topic?” “Does it ask questions, or does it propose topics?” “Is it specific or general?”).[3] The formal characteristics of the CfPs housed by UPenn are always the same, in this order: title, organizer, deadline for submission, description, and category tags (see Figure 1). Many of them, especially when they relate to a conference or panel, begin with a discussion of the relevance and timeliness of their topic and end with a list of possible topics or questions that submitters may address. For example, the CfP “War, Nursing, Narrative”,[4] which sought papers to present at MMLA in 2024, starts by asserting the importance of studying nursing narratives during wartime as a problem of seeing bodies not as a thing but as a “situation”. Then, the CfP specifies the theoretical inflection of the papers it seeks (“trauma theory” or “affect theory”) and ends by listing a set of possible topics to address (“the nurse as narrator”, “the nurse as façade”, “the nurse as carer”). Other CfPs, like “Black Feminist Excesses”,[5] a call for a working group at MLA 2025, ask guiding questions for submitters, like “How does Black feminism and womanism engage disparate, wayward, or fringe forms of identity, embodiment, materiality, affect and culture?” Finally, descriptions usually include other practical information, like the kind of material expected (abstract, paper, etc.), and how interested people might find more information on the subject.

Figure 1

Formal features of a CfP. Screenshot: UPenn CfP Website (Dataset Unique Id a3d08a43; Website: https://call-for-papers.sas.upenn.edu/cfp/2018/03/12/modernisms-in-fact).

Based on my notes on 100 CfPs and the general exploration of the dataset, I can summarize the characteristics of CfPs as follows:

  • CfPs are formulaic. They tend to be short and concise, and they have a similar structure. With an average word count of 359, CfP authors must combine descriptive, informative, and promotional rhetorical strategies into a coherent, clear, and readable text. Their structure generally includes practical information (when, where, how to submit), the description of a problem and topic of interest, and a few key research questions or perspectives on the topic.

  • CfPs are advertisements. They seek to attract the best possible submissions in their area of interest. Therefore, as a genre, they pay special attention to explaining why the theme they are advertising is relevant. The website works as a marketplace in which CfPs compete with one another. The market logic of the CfPs involves the consideration of target audiences, institutional affiliations, and funding opportunities.

  • CfPs look for original content. This implies that CfPs often draw lines that identify gaps in knowledge. Through the very way that they frame research questions or perspectives and how they describe the topics and relations that they look for, CfPs already carve the hermeneutical directions that will produce new content and discussions in the future.

  • CfPs function as promises of future discussions. “Promises”, Hannah Arendt said, “are the uniquely human way of ordering the future” (Arendt 1972, p. 92). In this sense, CfPs organize the future outlooks and outputs of academic production around the world. They spark the engine of the academic machine of knowledge production. CfPs are material evidence of the spirit of universities around the world; that is, of their drive to find new things and reach new conclusions.

  • CfPs are ephemeral. As a form, CfPs exist not only online but are still common inhabitants of bulletin boards across campuses. While they may get a second life (the CfP of an edited collection, for example, often comes after the CfP of a conference), their lifespan is usually dictated by the deadline for submission.

  • CfPs are field-, topic- and institution-oriented. Nearly 100% of the CfPs from the random sample made lists of relevant topics to address, and mentioned their institutional affiliations and fields of interest. Most of the CfPs also discussed the relevance or timeliness of their chosen topic.

(4.2) Dataset Entries

Figure 2 shows the absolute number of posts held by the website over the years. The graph, naturally, does not represent an accurate summary of the number of conferences in the humanities from 1995 to 2024. One of the difficulties of dealing with CfPs is that they are not centralized by one institution. Universities, faculty, and academic conventions from all over the world plan conferences, panels, and book collections every year and often use their own distribution media to advertise them. The UPenn CfP site was launched in 1995, in the middle of the decade in which the internet became widespread. It was not a product of a communal effort to centralize CfPs; instead, as the figure shows, the website grew organically, starting with only five posts in 1995, all of which asked for journal submissions. A big jump occurred between 2002 and 2003, as well as after 2004, reaching a maximum of 9,069 posts in 2006. The posts then stabilized around 4,000 per year for eight years and have decreased to an average of 2,500 in the last five years. Further research is needed to explain that change, especially to determine if it reflects an actual downsizing of conferences around the country or if the CfPs have moved elsewhere, like social media. There is also no clear evidence that the pandemic had a significant effect on the number of CfPs promoted, as they decreased only slightly during 2020.

Figure 2

Number of CfPs per year uploaded to the UPenn CfP Website, June 1995–April 2024.

A closer analysis of the data reveals more limitations that we must take into account. Trying to understand why 2006 was the year with the highest absolute number of posts and why there was a big drop after 2008, I realized, on inspecting the data manually, that people submitting CfPs before 2009 could only tag their posts with one category from the set of approved categories. However, many submitters were interested in tagging their CfPs with more than one category. Therefore, many scholars uploaded the same post several times, each marked with a different category. Starting in 2009, the site was improved to let submitters tag their posts with several categories. This example shows that the operational use of the website (the ways users engage with it) may affect any conclusion reached through analysis of this dataset. In order to correct this issue, I created a deduplicated dataset. The Python script recognized duplicated entries and merged them into one, adding the unique categories to each.[6] Figure 3 shows the data without duplicated entries.

The script recognized a total of 17,425 duplicated entries before 2009, out of a total of 30,823. This makes the number of unique total entries 68,865. The results of the next sections come from the deduplicated dataset.
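The merging step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the repository's deduplication_script.ipynb: it assumes each record is a dictionary with a “content” string and a “category” list, treats identical “content” values as duplicates (as note 6 describes), and keeps the first record's other fields. The function name `deduplicate` and the sample posts are hypothetical.

```python
def deduplicate(records):
    """Merge records sharing the same 'content', unioning their categories.

    The first record seen for a given content is kept; later duplicates
    only contribute category tags that are not already present.
    """
    merged = {}
    for rec in records:
        key = rec["content"]
        if key not in merged:
            # Copy the record so the input list is left untouched.
            merged[key] = {**rec, "category": list(rec["category"])}
        else:
            for tag in rec["category"]:
                if tag not in merged[key]["category"]:
                    merged[key]["category"].append(tag)
    return list(merged.values())

# Hypothetical pre-2009 posts: the same call uploaded once per category.
posts = [
    {"year": 2006, "content": "CfP: Gothic afterlives", "category": ["romantic"]},
    {"year": 2006, "content": "CfP: Gothic afterlives", "category": ["victorian"]},
    {"year": 2006, "content": "CfP: Lyric now", "category": ["poetry"]},
]
unique_posts = deduplicate(posts)
removed = len(posts) - len(unique_posts)
```

The same subtraction gives the figure reported above: 86,290 scraped posts minus 17,425 duplicates leaves 68,865 unique entries.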

(4.3) Category Tags

Figure 4 maps the frequency of the top 15 category tags by absolute count over the years.[7] The big jump from 2008 to 2009 in the number of categories is not only a consequence of an increase in the absolute number of CfPs (as Figure 3 shows, the difference between the number of submissions in 2008 and 2009 is only about 1,000). It is also a consequence of the site allowing posts to be tagged with multiple categories. It is important to note that the number of categories available has shifted through the years, as administrators update the site and add new ones to the list. Table 1 outlines when each category entered the website and the number of years that they have been available. The growing number of categories is also evidence of how the site has continually diversified and how administrators continue to adapt to logics of supply and demand.

Figure 3

Duplication fix. Number of CfPs per year uploaded to the UPenn CfP Website, June 1995–April 2024.

Figure 4

Top 15 category tags by absolute count over the years, June 1995–April 2024.

Table 1

Year each category was made available, and number of years each has been available, on the UPenn CfP Website.

CATEGORY | YEAR ADDED | YEARS AVAILABLE
Romantic | 1995 | 30
Journals and collections of essays | 1995 | 30
Humanities, computing and the internet | 1996 | 29
Eighteenth century | 1996 | 29
Twentieth century and beyond | 1996 | 29
Medieval | 1996 | 29
Cultural studies and historical approaches | 1996 | 29
American | 1996 | 29
Gender studies and sexuality | 1996 | 29
Renaissance | 1997 | 28
Theatre | 1997 | 28
Postcolonial | 1997 | 28
Professional topics | 1997 | 28
Bibliography and history of the book | 1998 | 27
Ethnicity and national identity | 1998 | 27
Poetry | 1998 | 27
Victorian | 1998 | 27
Theory | 1998 | 27
Film and television | 1998 | 27
Graduate conferences | 1998 | 27
Travel writing | 1999 | 26
African-American | 1999 | 26
Religion | 2000 | 25
Science and culture | 2000 | 25
International conferences | 2002 | 23
Children’s literature | 2002 | 23
Rhetoric and composition | 2007 | 18
General announcements | 2007 | 18
Ecocriticism and environmental studies | 2009 | 16
Classical studies | 2009 | 16
Popular culture | 2009 | 16
Modernist studies | 2010 | 15
Interdisciplinary | 2010 | 15
Translation studies | 2016 | 9
Pedagogy | 2016 | 9
World literatures and indigenous studies | 2016 | 9
English education | 2016 | 9
Fan studies and fandom | 2016 | 9
Online conferences | 2021 | 4
Veterans’ studies | 2022 | 3
Awards | 2024 | 1

Studying the category tags by absolute count gives a few interesting insights. As Figure 5 shows, the broadest categories have a larger absolute count, like “cultural studies and historical approaches”, “American”, or “interdisciplinary”. In fact, the “cultural studies” category appears in nearly half of the total entries. More specific, narrow, or niche categories have a smaller absolute count, like “veterans’ studies”, “online conferences”, and “translation studies”. When considering the year they were added, one should not ignore that a category like “interdisciplinary”, added in 2010, is already the second largest after less than 15 years, while, for instance, a category like “Victorian”, introduced in 1998, has barely surpassed 2,000 posts in almost 30 years. Another surprising finding, perhaps, is seeing categories like “children’s literature” or “popular culture” in the top spots, above other tags that could be more readily associated with humanities scholarship, such as “theory” or “poetry”.

Figure 5

Absolute counts of top 15 category tags, CfPs Dataset.

Because the number of categories is not consistent through the years, looking at the relative count may help to get a better grasp of the weight of different categories over time. One important finding is that after 2005 the category tags are spread quite uniformly; that is, there is not one that is overly dominant, as they all stay at or below 10% of the total count of category tags for over 20 years. Only one category, “journals and collections of essays”, stands out as dominant in the first years of the website, which suggests that only after 2005 did the site become a large repository for conference and panel submissions. A look at the total relative category distribution (see Figure 6) confirms the diverse and interdisciplinary nature of the CfPs. The very broad “cultural studies and historical approaches” gets the highest relative count at 11.9%, and no other category exceeds 9% of the total.

Figure 6

Frequency of top 15 category tags relative to total amount of category tags, CfPs Dataset.

Of course, one can also use the dataset to look more closely at specific categories that might be of interest to researchers. For example, Figure 7 plots the relative counts of just three categories: “interdisciplinary,” “theory,” and “twentieth century and beyond.” While a category like “twentieth century and beyond” has an overall relative frequency of just over 3%, it accounted for nearly 20% of the entries in 2006. The “theory” category saw a peak in 2005, decreasing significantly over the years until rebounding in 2015. The “interdisciplinary” category, on the other hand, has seen a steady increase since it was introduced in 2010, already surpassing the others in its first year of existence. This does not imply that, for example, the academic world has turned its attention away from theory-centric research and towards a more interdisciplinary approach. However, this data can provide a point of entry to further investigate the movements of these discussions over time.

Figure 7

Relative frequency of three category tags, per year. CfPs Dataset.
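A per-year relative frequency of this kind could be computed along the following lines. This is only a sketch: it assumes records with a “year” value and a “category” list, and it takes the denominator to be the number of tag assignments made in that year, which is one plausible reading of “relative count” and may not match the repository's scripts exactly. The function name `relative_tag_share_by_year` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def relative_tag_share_by_year(records):
    """Map year -> {tag: share of that year's tag assignments}."""
    per_year = defaultdict(Counter)
    for rec in records:
        per_year[rec["year"]].update(rec["category"])
    shares = {}
    for year, counts in per_year.items():
        total = sum(counts.values())
        shares[year] = {tag: n / total for tag, n in counts.items()}
    return shares

# Hypothetical posts, not drawn from the dataset.
posts = [
    {"year": 2010, "category": ["theory", "interdisciplinary"]},
    {"year": 2010, "category": ["interdisciplinary"]},
    {"year": 2011, "category": ["theory"]},
]
shares = relative_tag_share_by_year(posts)
```

Dividing by the number of posts in a year instead of the number of tag assignments would answer a slightly different question (how many CfPs carry a tag), so the choice of denominator should be stated alongside any plot.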

Finally, looking at which categories are often associated with others provides insights into how some areas of study are connected or isolated in the academic landscape. As a first approach, I calculated the number of times each category appeared standalone, that is, as the sole category tag for a CfP (see Figure 8). Most posts in the dataset are tagged with multiple categories. Only about 14% (or 9,706) of the CfPs were tagged with only one category. Of those tagged with only one category, “Medieval”, “Renaissance”, and “Victorian” were the categories in which this occurred most frequently. Most categories appeared standalone less than 5% of the time. As could be expected, a category like “interdisciplinary” only appeared standalone 133 times, or 0.6%.

Figure 8

Percentage of standalone category tags (categories > 5%), CfPs Dataset.
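The standalone measure can be illustrated with a short sketch. It assumes, as the 0.6% figure for “interdisciplinary” suggests, that the percentage for each tag is the number of posts where it is the only tag divided by the total number of posts carrying that tag; the record structure, function name `standalone_share`, and sample posts are assumptions for illustration only.

```python
from collections import Counter

def standalone_share(records):
    """For each tag, the fraction of its appearances where it is the only tag."""
    total = Counter()
    alone = Counter()
    for rec in records:
        tags = rec["category"]
        total.update(tags)
        if len(tags) == 1:
            alone[tags[0]] += 1
    return {tag: alone[tag] / total[tag] for tag in total}

# Hypothetical posts, not drawn from the dataset.
posts = [
    {"category": ["medieval"]},
    {"category": ["medieval", "theory"]},
    {"category": ["theory", "interdisciplinary"]},
    {"category": ["interdisciplinary", "medieval"]},
]
share = standalone_share(posts)
single_tag_posts = sum(1 for rec in posts if len(rec["category"]) == 1)
```

The dataset-level figure quoted above follows the same logic at the level of posts: 9,706 single-tag CfPs out of 68,865 unique entries is about 14%.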

Analyzing which categories are associated with one another also proves useful for studying how different fields influence each other. A network graph like the one below, in which I included 5 random categories from the set, serves as an example of how researchers might study the connections between their fields of interest (See Figure 9). As the graph shows, a category like “theory”, which one could presume has more plasticity than others such as “classical studies” or “professional topics”, was connected more often with the other categories and therefore occupies a central place in the network graph.

Figure 9

Example of a network graph of frequency of connections between selected category tags, CfPs Dataset.
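One way to prepare data for a graph like this is to count how often pairs of tags share a post. The sketch below uses only the standard library and is not the code behind Figure 9; the record structure, the function names `cooccurrence_edges` and `weighted_degree`, and the sample posts are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(records, selected=None):
    """Count how often each unordered pair of tags appears on the same post.

    If `selected` is given, only tags in that collection are considered.
    """
    edges = Counter()
    for rec in records:
        tags = set(rec["category"])
        if selected is not None:
            tags &= set(selected)
        # Sorting makes (a, b) and (b, a) collapse into one key.
        for a, b in combinations(sorted(tags), 2):
            edges[(a, b)] += 1
    return edges

def weighted_degree(edges):
    """Sum the co-occurrence weights attached to each tag."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

# Hypothetical posts, not drawn from the dataset.
posts = [
    {"category": ["theory", "classical studies"]},
    {"category": ["theory", "professional topics"]},
    {"category": ["theory", "poetry", "classical studies"]},
    {"category": ["poetry"]},
]
edges = cooccurrence_edges(posts)
degree = weighted_degree(edges)
most_connected = degree.most_common(1)[0][0]
```

Each weighted pair can then be added to a NetworkX graph with `Graph.add_edge(a, b, weight=w)` for layout and drawing; a tag with a high weighted degree, like “theory” in this toy example, would tend to sit near the center.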

(5) Implications/Applications

My initial data exploration is only a small subset of what this dataset affords. As I argue in the introduction, the dataset provides entry into a very understudied genre that may give useful context about the ways the academic world functions. Conferences, panels, and book collections are a fundamental part of academic work and show how knowledge is produced collectively. As promises for future discussions, studying the production of CfPs in real time will tell us more about where the humanities are going than where they are in the present. As media theorist Jussi Parikka (2020) has argued, “universities consist of a changing set of practices and techniques programmed into students and future staff” (p. 59). CfPs serve to mark a distinction between academic and non-academic strands of knowledge, and they are always conditioned by the practices and techniques of different habits (gestures of opening up ideas for discussion, new excavations, or further rereadings), gatekeeping, and networking. With that being said, I want to outline a few specific areas that my dataset can contribute to.

(5.1) Turns and trends

A quick search on the Cornell University Library website of the word “turns”, filtered by “humanities”, returns recent articles with words in their titles such as “The Decolonial Turn”, “The Interdisciplinary Turn”, or the “Posthumanist Turn”. Rather than “trends”, scholars in the humanities prefer the word “turn” when talking about shifts that affect the academic landscape. As Doris Bachmann-Medick (2016) has argued, turns transform theory, set new courses, and restructure academic fields while also setting them as a place of competition and conflict. A turn describes a gathering motion —a flocking to, an increase in momentum— around something that promises to revolutionize the field. In practical terms, turns describe the perception that many academics are concentrating on a theory or method through which they approach their jobs. In a critique of the turn to affect, for example, Ruth Leys begins with a question: “Why are so many scholars today in the humanities fascinated by the idea of affect?” (2011, p. 435). A turn, then, speaks about the academic world in motion —as one can see from Leys’s question, the turn to affect is happening in the moment (today) and is recognizable because of an increased activity of scholars around it (a fascination). This means, in other words, that a significant number of presentations and conferences about affect must have taken place. However, this also means that when turns appear in publications like these they have always already occurred.

This dataset has the potential to contribute to the study of “turns” through a different vantage point. Scholars can use this database to explore how turns came about in specific years, in the motion between conferences and discussions. Leys, for example, attributes the launch of the “turn to affect” to work made by Eve Sedgwick in the late nineties (Leys, 2017, p. 2). Searching for the kinds of conferences that called for original work on affect would illustrate the extent to which Sedgwick’s work was fundamental, and, more generally, how specific questions, perspectives, or conferences shaped the direction and contours of what we now know as the turn to affect. As a hypothesis for future research, following CfPs in real time might give us glimpses of future turns to come.
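A first pass at such a search might simply count, per year, the CfPs whose text mentions a term. The following is a rough sketch under assumptions: records are dictionaries with “year” and “content” fields, and the function name `yearly_mentions` and the sample texts are hypothetical.

```python
import re
from collections import Counter

def yearly_mentions(records, term):
    """Count posts per year whose text contains `term` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = Counter()
    for rec in records:
        if pattern.search(rec["content"]):
            hits[rec["year"]] += 1
    return hits

# Hypothetical texts, not drawn from the dataset.
posts = [
    {"year": 1999, "content": "Papers on shame, affect, and performativity"},
    {"year": 2008, "content": "Affect theory and the novel"},
    {"year": 2008, "content": "How do archives affect reading?"},
    {"year": 2008, "content": "Affective labor"},
]
hits = yearly_mentions(posts, "affect")
```

Whole-word matching skips “Affective” but still catches the verb in “archives affect reading”, so counts like these are a starting point for close reading and not a measure of the turn itself.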

(5.2) Network analysis and data archaeology

Academic papers are always explicitly networks —that is, no academic paper pretends to work in isolation. This is made clear by the fact that every academic paper has a reference list and engages with other texts and authors. However, the network into which every academic paper enters is not limited to the explicit references it makes. Every paper has an invisible history to it. Scholars in the humanities are often archaeologists of texts. While we are used to following literary works through their networks of production, reception, and distribution, we are less used to doing this kind of work with academic papers. This has the effect of making academic papers appear as static, finished works, born out of the mind of a scholar. This dataset provides information that could interest scholars wanting to trace articles to their beginnings. One could use a specific CfP to see the relationship between a certain set of questions, the conference papers that were produced around them (if such a record exists), and the peer-reviewed articles that followed them. We could then have a better picture of how our academic network moves and grows through time.

Moreover, universities are also always networks. Friedrich Kittler, who amply studied the history of universities, has noted how medieval universities were born out of acquiring the knowledge storage and distribution monopoly from monasteries through the establishment of copying departments and exclusive postal networks (1996, p. 6; 2004, p. 245). The question of how to produce, store, and distribute knowledge has been nothing short of fundamental for the functioning and expansion of universities. From a media studies perspective, a history of CfPs and their relation to specific technologies of communication is yet to be written.

Finally, because this dataset includes information on connected authors, places, and conventions, as well as the studied category tags, scholars could apply social network analysis to study it. While traditional social network analysis usually considers small populations, work in the past decades has been successful in applying it to big populations, such as in the study of networks of scientific collaborations, recorded in real time by electronic databases (Barabási et al., 2006, p. 6).

(5.3) Content analysis and topic modeling

Taking a page from Laura McGrath’s work on book deal announcements, this dataset also provides an opportunity to engage in content analysis. Researchers could manually assign descriptive codes to a corpus of CfPs of their interest; as an example, one could complement McGrath’s research by trying to differentiate which CfPs could be tagged as “CfPs about race”, and how this informs the racial discourse within academia.

Topic modeling could also be an interesting way to study the CfPs as a genre. For instance, using a dataset of over 21,000 scholarly articles, Ted Underwood and Andrew Goldstone have used topic modeling to study shifts in literary studies over 120 years (Goldstone and Underwood, 2014). Although the category tags are already a kind of “topic” that organizes the collection, using unsupervised topic modeling can lead to interesting questions. For example, one may wonder what kinds of words revolve around a particular topic, or what kind of unmarked categories might appear from an unsupervised analysis. Because the structure of this dataset depends on the decisions of administrators to name and include the category tags, topic modeling is a good way to map the discursive contours of this collection without being limited to those tagging decisions.

(6) Conclusion

The Call-for-Papers dataset provides an exciting opportunity to engage in interdisciplinary research. It provides a new direction to combine statistical, computational, and humanistic research methods to better describe the contours that govern the inner workings of university life. We have not yet studied how a genre like CfPs might impact the amount and type of academic research in the humanities or other areas across the USA and the world. We do not know if certain types of CfPs are conducive to better or worse research outputs. We do not know to what extent the CfPs may reflect or frame the discourse of how we talk about or interpret literature. Applying computational methods and close reading to this understudied genre will allow us to test our assumptions about the current state of research in the humanities and will give us a better grasp of how the ways we “call” for original content shape the literary and cultural fields.

Notes

[1] That is, 86,290 short versions and 86,290 full-description versions for a total of 172,580 individual entries.

[2] Full data and metadata descriptions are available in the repository.

[3] You can find the full notes and graphs on this sample in the dataset repository (close_reading_notes.xlsx).

[4] Dataset Unique Id: 2613482488.

[5] Dataset Unique Id: 387532129.

[6] The deduplication script (“deduplication_script.ipynb”) treats as duplicates the entries that have the same information under the “content” tag.

[7] You can find more graphs and the information for all 41 categories in the repository.

Acknowledgements

I wish to thank Dr. Lindsay Thomas (Cornell University) for her invaluable help, guidance, and feedback throughout this project. My gratitude also goes to my supervisor Dr. Caroline Levine. Finally, I would also like to thank Luis Sanmiguel for helping me resolve coding issues while exploring the data.

Competing Interests

The author has no competing interests to declare.

Author Contributions

Juan Pablo Albornoz: Dataset creator, data curation, analysis, investigation, writing.

DOI: https://doi.org/10.5334/johd.278 | Journal eISSN: 2059-481X
Language: English
Submitted on: Nov 11, 2024
Accepted on: Feb 3, 2025
Published on: Mar 5, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Juan Pablo Albornoz, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.