Understanding teams and productivity in information retrieval research: Academia, industry, and cross-community collaborations

By: Jiaqi Lei, Liang Hu, Yi Bu, and Jiqun Liu
Open Access | Oct 2025

Figures & Tables

Figure 1.

Flow chart of data processing. * indicates the focus of this current paper.

Figure 2.

Distribution of the number of papers in three types over the years.

Figure 3.

The distribution of conference frequency.

Figure 4.

The top-10 published conferences in the three categories.

Figure 5.

Heat map of the correlation coefficient matrix for the three types of articles.

Figure 6.

Conversion rate for the three types of articles.

Figure 7.

Variation of cosine similarity with year for three types of articles. “Academic-industry” indicates the similarity between publications by authors purely from academia and publications by authors purely from industry. “Industry-collaboration” indicates the similarity between publications by authors purely from industry and publications co-authored by scientists from academia and industry. “Collaboration-academic” refers to the similarity between publications co-authored by scientists from academia and industry and publications by authors purely from academia.
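The three pairwise comparisons named in the Figure 7 caption can be illustrated with a minimal sketch. It assumes that each publication type in a given year is summarized as a single term-count vector; the caption does not say how the texts were vectorized, so the representation, the function names, and the toy counts below are all hypothetical and are not the authors' pipeline.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def yearly_similarities(academic: Counter, industry: Counter,
                        collaboration: Counter) -> dict:
    """Return the three pairwise similarities, keyed as in the caption."""
    return {
        "academic-industry": cosine_similarity(academic, industry),
        "industry-collaboration": cosine_similarity(industry, collaboration),
        "collaboration-academic": cosine_similarity(collaboration, academic),
    }


# Hypothetical term counts for one year (illustration only).
toy_academic = Counter({"query": 2, "ranking": 1})
toy_industry = Counter({"query": 1, "ads": 2})
toy_collaboration = Counter({"query": 2, "ranking": 1})

toy_result = yearly_similarities(toy_academic, toy_industry, toy_collaboration)
```

Repeating this computation for each year would give one curve per pair, which is the shape of data the figure plots.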

Figure 8.

Distribution of the number of co-authors for each type of publications. Since the Academia-Industry Collaboration is defined as having at least one author from academia and one author from industry, the blue curve representing Academia-Industry Collaboration starts with a number of co-authors of two.

ACM article keywords/phrases (ranked in alphabetical order by column; all keywords/phrases lowercased)

accountability information retrieval | explainability information retrieval | navigation | sentiment analysis
active learning | exploratory search | neural network | similarity
adaptation | eye tracking | neural networks | similarity measure
annotation | faceted search | novelty | similarity search
annotations | fairness information retrieval | online social networks | social media
audio | feature extraction | ontologies | social network
augmented reality | feature selection | ontology | social network analysis
benchmark | federated search | open data | social networks
big data | filtering | opinion mining | social search
blog | flickr | optimization | social tagging
browsing | folksonomy | P2P | spam
caching | geographic information retrieval | pagerank | spoken search system
CBIR | graph mining | passage retrieval | sponsored search
children | group recommendation | peer-to-peer | summarization
classification | hashing | performance | supervised learning
cloud computing | human-computer interaction | performance evaluation | SVM
clustering | image annotation | personal information management | tagging
collaboration | image classification | personalization | tags
collaborative filtering | image retrieval | personalized search | test collection
collaborative tagging | image search | privacy | test collections
community detection | implicit feedback | pseudo relevance feedback | text categorization
complex event processing | index | pseudo-relevance feedback | text classification
content analysis | indexing | query | text mining
content-based filtering | information extraction | query classification | time series
content-based image retrieval | information filtering | query expansion | topic model
content-based retrieval | information retrieval | query formulation | topic modeling
context | information seeking | query intent | topic models
context-awareness | information visualization | query log analysis | transfer learning
conversational information retrieval | interaction | query logs | transparency information retrieval
convolutional neural networks | interactive information retrieval | query performance prediction | trust
correlation | interoperability | query processing | twitter
credibility | inverted index | query reformulation | unsupervised learning
cross-language information retrieval | kernel methods | query suggestion | usability
cross-modal retrieval | keyword search | question answering | user behavior
crowdsourcing | knowledge base | random walk | user interaction
data integration | knowledge management | ranking | user interface
data mining | language model | RDF | user interfaces
database | language modeling | recommendation | user modeling
deep learning | language models | recommendation system | user profile
digital humanities | learning | recommendation systems | user profiling
digital libraries | learning to rank | recommender system | user studies
digital library | lifelogging | recommender systems | user study
digital preservation | link analysis | relation extraction | video
dimensionality reduction | linked data | relevance | video analysis
distributed information retrieval | locality sensitive hashing | relevance feedback | video annotation
diversification | location-based services | re-ranking | video retrieval
diversity | log analysis | responsible information retrieval | video search
document clustering | machine learning | retrieval | video summarization
document representation | machine translation | retrieval models | visualization
document retrieval | MapReduce | sampling | web
e-commerce | matrix factorization | scalability | web 2.0
education | measurement | search | web mining
efficiency | metadata | search behavior | web search
e-government | mobile | search engine | web search engine
emotion | mobile computing | search engines | web service
enterprise search | mobile devices | semantic relatedness | web services
entity linking | multimedia | semantic search | wiki
ethnics information retrieval | multimedia retrieval | semantic similarity | wikipedia
evaluation | music | semantic web | word embeddings
event detection | music information retrieval | semantics | world wide web
events | music recommendation | semi-supervised learning | XML
experimentation | named entity recognition | sensor networks | XML retrieval
expert finding | natural language processing

Research topics over time

Year | Academia | Industry | Academia-Industry Collaboration
2000 | intelligent libraries library classes library technologies novel browser internet classrooms | video classroom video watermarking video performance video technical video recording | learning algorithms analysis hashing proliferation internet classifies algorithms learning algorithm
2001 | mouse popular mouse 3d popular multiplayer 3d computing powerful 3d | advanced algorithms computationally feasible researchers improve modeling useful interface powerful | software engineers designer needs designing ontology xml software documentation engineers
2002 | libraries tutorial databases attractive library technology efficient indexing databases tutoring | designing web auctions improving expanding rehearsal bioinformatics emerging search engines | offering algorithms tackling algorithms semantic web algorithms software web query
2003 | optimization cancer audition algorithms partitioning algorithms algorithms comparison algorithms haplotype | music photo2video retrieves songs music concert music database new songs | simplifying web internet experiments popular web online semantic new spam
2004 | algorithm updating novel algorithms algorithms lessons developing algorithms algorithms learning | toolkit debugging browser optimizations search algorithms web verification algorithms methodology | simplifying web internet experiments popular web online semantic new spam
2005 | valuable indexing algorithms improving efficient ontology interesting research important crosscutting | algorithms methodology magic instructional data webgazeanalyzer webgazeanalyzer brings laser pointer | holistic algorithms novel internetworking smart phones algorithms scalability wireless broadband
2006 | valuable indexing algorithms improving efficient ontology interesting research important crosscutting | research challenging executed twice algorithms counter bioinformatics motivation steep learning | challenge traffic traffic engineering algorithms damping algorithms research engineering algorithms
2007 | servers cheating privacy vulnerabilities online personalization leakage internet online anonymity | huge database hashing billions largest commerce amazon highly huge collections | major browsers algorithms large algorithms widely extensive programming fastdash developer
2008 | wikipedia huge browsers popular huge databases favourite websites winning podcasts | study robots survey robots querybuilder query querylogs bundling botnet detection | algorithm recommender study novel research recent tagging podcasts researchers paper
2009 | rfid popular patents important rfid algorithms patents essential tagging expert | browsers actively growth online online advertising online surveys pushing browsers | internet researchers importance researchers benchmarking browsers research researchers data researchers
2010 | new algorithms hashtag innovation evolving wiki new apps discovery bioinformatics | research revolutionize expensive simulators database huge budget challenging consumption skyrocketed | wikiprojects increased pagerank algorithm playlists photoselect brainstorming stylus retagging online
2011 | design tutorial bioinformaticians designing designing sparql clustering bioinformatics tutorials technologies | driver safety automobiles tutorial refocus driving driver infotainment opportunistic driver | attractive websites understanding internet moderating online good websites modeling internet
2012 | comics techcommix cancer increased cyberinfrastructure scientist algorithms physicists scientists seeking | improving tweet threefold allows reducing aliasing algorithm reducing strategies threefold | avatar conferencing rearranging videos video hashing video tutorials media tutorials
2013 | detectives solving algorithms greedy played detectives crime notepad detecting cyberbullying | latest poll proposed algorithm tweeted wedding motivation graphbuilder research innovations | project openstreetmap tutorial overview openstreetmap editors pattern openstreetmap modeling tutorial
2014 | discriminating online misinformation crowdturfing instagram traffic kickstarter interview internet challenging | improving online improvements online google overtaking web analytics wikipedia benefitted | pagerank algorithm web searchers search engines search engine online surveys
2015 | apps revolutionize apps changing simplifying mobile investigating smartphone developing smartwatch | improve genetic experiments expensive expensive experiments novel algorithms quantitative genetic | youtube tutorials movielens netflix netflix datasets netflix dataset youtube flickr
2016 | learning analytics analytics learning agile analytics learning smartwatches learning evolution | google facebook facebook microsoft facebook conducting traffic staggering economics rigorously | chatbots developers needed mobile designing android prototype chatbot apps study
2017 | videos rebooting videoconferencing application video tutorials video study videoconferencing | google facebook facebook microsoft facebook conducting traffic staggering economics rigorously | online bullying facebook misleading bullying twitter online distractions reducingcontroversy homepage
2018 | videos rebooting videoconferencing application video tutorials video study videoconferencing | important research seismic interpreters cryptography needed noisy training research challenges | crowdsourcing analytics reuse networking cache management cache reuse web transformational

DOI: https://doi.org/10.2478/jdis-2025-0051 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Submitted on: Jun 6, 2025
Accepted on: Sep 19, 2025
Published on: Oct 16, 2025
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Jiaqi Lei, Liang Hu, Yi Bu, Jiqun Liu, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution 4.0 License.

AHEAD OF PRINT