Katy Börner

  • katy@indiana.edu
  • (812) 855-3256
  • Home Website
  • Victor H. Yngve Professor
    Information and Library Science, School of Informatics and Computing
  • Adjunct Professor
    Statistics

Field of study

  • User modeling, information visualization, virtual reality interfaces, human computer interaction

Education

  • Ph.D. in Computer Science, University of Kaiserslautern, 1997
  • Master of Engineering in Electronics, University of Technology Leipzig, 1991
  • Technical School Examinations, RFT Fernmeldewerk Leipzig, 1987

Representative publications

Visualizing knowledge domains (2003)
Katy Börner, Chaomei Chen and Kevin W Boyack
Annual review of information science and technology, 37 (1), 179-255

This chapter reviews visualization techniques that can be used to map the ever-growing domain structure of scientific disciplines and to support information retrieval and classification. In contrast to the comprehensive surveys conducted in traditional fashion by Howard White and Katherine McCain (1997, 1998), this survey not only reviews emerging techniques in interactive data analysis and information visualization, but also depicts the bibliographical structure of the field itself. The chapter starts by reviewing the history of knowledge domain visualization. We then present a general process flow for the visualization of knowledge domains and explain commonly used techniques. In order to visualize the domain reviewed by this chapter, we introduce a bibliographic data set of considerable size, which includes articles from the citation analysis, bibliometrics, semantics, and visualization literatures. Using tutorial style …

Mapping the backbone of science (2005)
Kevin W Boyack, Richard Klavans and Katy Börner
Scientometrics, 64 (3), 351-374

This paper presents a new map representing the structure of all of science, based on journal articles, including both the natural and social sciences. Similar to cartographic maps of our world, the map of science provides a bird’s eye view of today’s scientific landscape. It can be used to visually identify major areas of science, their size, similarity, and interconnectedness. In order to be useful, the map needs to be accurate on a local and on a global scale. While our recent work has focused on the former aspect, this paper summarizes results on how to achieve structural accuracy. Eight alternative measures of journal similarity were applied to a data set of 7,121 journals covering over 1 million documents in the combined Science Citation and Social Science Citation Indexes. For each journal similarity measure we generated two-dimensional spatial layouts using the force-directed graph layout tool, VxOrd …

Scholarly networks on resilience, vulnerability and adaptation within the human dimensions of global environmental change (2006)
Marco A Janssen, Michael L Schoon, Weimao Ke and Katy Börner
Global environmental change, 16 (3), 240-252

This paper presents the results of a bibliometric analysis of the knowledge domains resilience, vulnerability and adaptation within the research activities on human dimensions of global environmental change. We analyzed how 2286 publications between 1967 and 2005 are related in terms of co-authorship relations, and citation relations. The number of publications in the three knowledge domains increased rapidly between 1995 and 2005. However, the resilience knowledge domain is only weakly connected with the other two domains in terms of co-authorships and citations. The resilience knowledge domain has a background in ecology and mathematics with a focus on theoretical models, while the vulnerability and adaptation knowledge domains have a background in geography and natural hazards research with a focus on case studies and climate change research. There is an increasing number of cross …

Approaches to understanding and measuring interdisciplinary scientific research (IDR): A review of the literature (2011)
Caroline S Wagner, J David Roessner, Kamau Bobb, Julie Thompson Klein, Kevin W Boyack, Joann Keyton ...
Journal of informetrics, 5 (1), 14-26

Interdisciplinary scientific research (IDR) extends and challenges the study of science on a number of fronts, including creating output science and engineering (S&E) indicators. This literature review began with a narrow search for quantitative measures of the output of IDR that could contribute to indicators, but the authors expanded the scope of the review as it became clear that differing definitions, assessment tools, evaluation processes, and measures all shed light on different aspects of IDR. Key among these broader aspects is (a) the importance of incorporating the concept of knowledge integration, and (b) recognizing that integration can occur within a single mind as well as among a team. Existing output measures alone cannot adequately capture this process. Among the quantitative measures considered, bibliometrics (co-authorships, co-inventors, collaborations, references, citations and co-citations) are …

Network science (2007)
Katy Börner, Soma Sanyal and Alessandro Vespignani
Annual review of information science and technology, 41 (1), 537-607

This chapter reviews the highly interdisciplinary field of network science, which is concerned with the study of networks, be they biological, technological, or scholarly in character. It contrasts, compares, and integrates techniques and algorithms developed in disciplines as diverse as mathematics, statistics, physics, social network analysis, information science, and computer science. A coherent theoretical framework, including static and dynamical modeling approaches, is provided along with discussion of non-equilibrium techniques recently introduced for modeling growing networks. The chapter also provides a practical framework by reviewing major processes involved in the study of networks such as network sampling, measurement, modeling, validation, and visualization. For each of these processes, we explain and exemplify commonly used approaches. Aiming at a gentle yet formally correct introduction of …

Atlas of science: Visualizing what we know (2010)
Katy Börner
The MIT Press.

Cartographic maps have guided our explorations for centuries, allowing us to navigate the world. Science maps have the potential to guide our search for knowledge in the same way, allowing us to visualize scientific results. Science maps help us navigate, understand, and communicate the dynamic and changing structure of science and technology--help us make sense of the avalanche of data generated by scientific research today. Atlas of Science, featuring more than thirty full-page science maps, fifty data charts, a timeline of science-mapping milestones, and 500 color images, serves as a sumptuous visual index to the evolution of modern science and as an introduction to "the science of science"--charting the trajectory from scientific concept to published results. Atlas of Science, based on the popular exhibit, "Places & Spaces: Mapping Science," describes and displays successful mapping techniques. The …

Mapping knowledge domains (2004)
Richard M Shiffrin and Katy Börner
Proceedings of the National Academy of Sciences, 101 (suppl 1), 5183-5185

The term “mapping knowledge domains” was chosen to describe a newly evolving interdisciplinary area of science aimed at the process of charting, mining, analyzing, sorting, enabling navigation of, and displaying knowledge. This field is aimed at easing information access, making evident the structure of knowledge, and allowing seekers of knowledge to succeed in their endeavors. Although thousands of years old, this area has undergone a sea change in the last 15 years, a change fostered by an explosion of the amount of information available, the accessibility of that information due to electronic storage, and the new techniques of analysis, retrieval, and visualization that are made possible by vast increases in computational storage capacity and processing speed and power. Many of us are so involved in the new ways of accessing knowledge that we have forgotten how recent is the change to computerized …

The simultaneous evolution of author and paper networks (2004)
Katy Börner, Jeegar T Maru and Robert L Goldstone
Proceedings of the National Academy of Sciences, 101 (suppl 1), 5266-5273

There has been a long history of research into the structure and evolution of mankind’s scientific endeavor. However, recent progress in applying the tools of science to understand science itself has been unprecedented because only recently has there been access to high-volume and high-quality data sets of scientific output (e.g., publications, patents, grants) and computers and algorithms capable of handling this enormous stream of data. This article reviews major work on models that aim to capture and recreate the structure and dynamics of scientific evolution. We then introduce a general process model that simultaneously grows coauthor and paper citation networks. The statistical and dynamic properties of the networks generated by this model are validated against a 20-year data set of articles published in PNAS. Systematic deviations from a power law distribution of citations to papers are well fit by a model …

Mapping topics and topic bursts in PNAS (2004)
Ketan K Mane and Katy Börner
Proceedings of the National Academy of Sciences, 101 (suppl 1), 5287-5290

Scientific research is highly dynamic. New areas of science continually evolve; others gain or lose importance, merge, or split. Due to the steady increase in the number of scientific publications, it is hard to keep an overview of the structure and dynamic development of one’s own field of science, much less all scientific domains. However, knowledge of “hot” topics, emergent research frontiers, or change of focus in certain areas is a critical component of resource allocation decisions in research laboratories, governmental institutions, and corporations. This paper demonstrates the utilization of Kleinberg’s burst detection algorithm, co-word occurrence analysis, and graph layout techniques to generate maps that support the identification of major research topics and trends. The approach was applied to analyze and map the complete set of papers published in PNAS in the years 1982-2001. Six domain experts …

Studying the emerging global brain: Analyzing and visualizing the impact of co‐authorship teams (2005)
Katy Börner, Luca Dall'Asta, Weimao Ke and Alessandro Vespignani
Complexity, 10 (4), 57-67

This article introduces a suite of approaches and measures to study the impact of co‐authorship teams based on the number of publications and their citations on a local and global scale. In particular, we present a novel weighted graph representation that encodes coupled author‐paper networks as a weighted co‐authorship graph. This weighted graph representation is applied to a dataset that captures the emergence of a new field of science and comprises 614 articles published by 1036 unique authors between 1974 and 2004. To characterize the properties and evolution of this field, we first use four different measures of centrality to identify the impact of authors. A global statistical analysis is performed to characterize the distribution of paper production and paper citations and its correlation with the co‐authorship team size. The size of co‐authorship clusters over time is examined. Finally, a novel local, author …

Analyzing and visualizing the semantic coverage of Wikipedia and its authors (2007)
Todd Holloway, Miran Bozicevic and Katy Börner
Complexity, 12 (3), 30-40

This article presents a novel analysis and visualization of English Wikipedia data. Our specific interest is the analysis of basic statistics, the identification of the semantic structure and the age of the categories in this free online encyclopedia, and the content coverage of its highly productive authors.

A multi-level systems perspective for the science of team science (2010)
Katy Börner, Noshir Contractor, Holly J Falk-Krzesinski, Stephen M Fiore, Kara L Hall, Joann Keyton ...
Science Translational Medicine, 2 (49), 1-5

This Commentary describes recent research progress and professional developments in the study of scientific teamwork, an area of inquiry termed the “science of team science” (SciTS, pronounced “sahyts”). It proposes a systems perspective that incorporates a mixed-methods approach to SciTS that is commensurate with the conceptual, methodological, and translational complexities addressed within the SciTS field. The theoretically grounded and practically useful framework is intended to integrate existing and future lines of SciTS research to facilitate the field’s evolution as it addresses key challenges spanning macro, meso, and micro levels of analysis.

Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches (2011)
Kevin W Boyack, David Newman, Russell J Duhon, Richard Klavans, Michael Patek, Joseph R Biberstine ...
PloS one, 6 (3), 1-11

Background: We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology: We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches were comprised of five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models – BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the …

Calling on a million minds for community annotation in WikiProteins (2008)
Barend Mons, Michael Ashburner, Christine Chichester, Erik van Mulligen, Marc Weeber, Johan den Dunnen ...
Genome biology, 9 (5), R89

WikiProteins enables community annotation in a Wiki-based system. Extracts of major data sources have been fused into an editable environment that links out to the original sources. Data from community edits create automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. In addition, indirect associations between concepts have been calculated. We call on a 'million minds' to annotate a 'million concepts' and to collect facts from the literature with the reward of collaborative knowledge discovery. The system is available for beta testing at http://www.wikiprofessional.org.

Design and update of a classification system: The UCSD map of science (2012)
Katy Börner, Richard Klavans, Michael Patek, Angela M Zoss, Joseph R Biberstine, Robert P Light ...
PloS one, 7 (7), e39464

Global maps of science can be used as a reference system to chart career trajectories, the location of emerging research frontiers, or the expertise profiles of institutes or nations. This paper details data preparation, analysis, and layout performed when designing and subsequently updating the UCSD map of science and classification system. The original classification and map use 7.2 million papers and their references from Elsevier’s Scopus (about 15,000 source titles, 2001–2005) and Thomson Reuters’ Web of Science (WoS) Science, Social Science, Arts & Humanities Citation Indexes (about 9,000 source titles, 2001–2004)–about 16,000 unique source titles. The updated map and classification adds six years (2005–2010) of WoS data and three years (2006–2008) from Scopus to the existing category structure–increasing the number of source titles to about 25,000. To our knowledge, this is the first time that a widely used map of science was updated. A comparison of the original 5-year and the new 10-year maps and classification system show (i) an increase in the total number of journals that can be mapped by 9,409 journals (social sciences had a 80% increase, humanities a 119% increase, medical (32%) and natural science (74%)), (ii) a simplification of the map by assigning all but five highly interdisciplinary journals to exactly one discipline, (iii) a more even distribution of journals over the 554 subdisciplines and 13 disciplines when calculating the coefficient of variation, and (iv) a better reflection of journal clusters when compared with paper-level citation data. When evaluating the map with a listing of desirable features for maps of …
