
Introducing Chiron, a Full-Stack Framework for Metrical Analysis: Part 2 – Data Interpretation

Daniele Fusi
2021-01-01

Abstract

This paper is the second part of an introduction to Chiron, a full-stack software framework for automated metrical analysis, applicable to any language, metrical tradition, and digital format. This part focuses on data interpretation, which follows the data collection discussed in the first part. Whereas the analysis process collects data modeled as text units, segments, and traits, and is typically performed once, data observation and interpretation are unlimited, open to any working hypothesis, purpose, and numerical treatment, including very powerful and complex machine-learning-driven techniques. Once the metrical database has been filled with data, users can make any number and type of observations on each analyzed verse, each observation being defined in an independent, composable software module, the observer. The observation data models and their types provide a concrete example of the quality and quantity of data added to the database at this stage, covering language-independent or language-specific phenomena, from the lowest to the highest level of granularity, for any phenomenon we may want to observe. Such data are laid out in a multidimensional space, which varies according to the phenomenon being handled. Once stored, these data can be aggregated in various ways, both inside and outside the database, and exported for numeric processing using standard data-science tools and techniques. A very simple example of this processing is provided, showing how the availability of collected data at a higher order of magnitude (several million records), together with the formalization of the collection method, makes it possible to leverage such powerful tools, which may open new scenarios in metrical and linguistic research.
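
To make the observer pattern described in the abstract more concrete, the sketch below is a minimal, hypothetical illustration in Python. It is not the actual Chiron API: the names (`Observer`, `observe`, `run_observers`, the trait keys) and the reduction of an analyzed verse to a plain dictionary are assumptions made purely for illustration. It shows how independent, composable observation modules could each emit records into a flat observation table, later aggregated with standard data-science tools such as pandas.

```python
# A minimal, hypothetical sketch of composable "observer" modules: each observer
# inspects an analyzed verse (here reduced to a plain dict) and emits one or more
# observation records. Names and structures are illustrative assumptions only,
# not the actual Chiron API.
from abc import ABC, abstractmethod
from typing import Iterable
import pandas as pd


class Observer(ABC):
    """An independent, composable observation module."""

    @abstractmethod
    def observe(self, verse: dict) -> Iterable[dict]:
        """Yield observation records for a single analyzed verse."""


class SyllableCountObserver(Observer):
    """Counts the syllables of a verse (a language-independent observation)."""

    def observe(self, verse: dict) -> Iterable[dict]:
        yield {"verse_id": verse["id"], "phenomenon": "syllable_count",
               "value": len(verse["syllables"])}


class LongSyllableObserver(Observer):
    """Counts long syllables (a quantity-based, tradition-specific observation)."""

    def observe(self, verse: dict) -> Iterable[dict]:
        longs = sum(1 for s in verse["syllables"] if s["quantity"] == "long")
        yield {"verse_id": verse["id"], "phenomenon": "long_syllables",
               "value": longs}


def run_observers(verses: list[dict], observers: list[Observer]) -> pd.DataFrame:
    """Collect all observations into a flat table ready for aggregation/export."""
    records = [rec for v in verses for obs in observers for rec in obs.observe(v)]
    return pd.DataFrame(records)


if __name__ == "__main__":
    verses = [
        {"id": 1, "syllables": [{"quantity": "long"}, {"quantity": "short"},
                                {"quantity": "short"}]},
        {"id": 2, "syllables": [{"quantity": "long"}, {"quantity": "long"}]},
    ]
    df = run_observers(verses, [SyllableCountObserver(), LongSyllableObserver()])
    # Aggregate per phenomenon, as one might before exporting the table
    # for further processing with standard data-science tools.
    print(df.groupby("phenomenon")["value"].describe())
```

Each observer is self-contained, so new observations can be added or removed without touching the analysis stage or the other observers; the resulting flat table can then be aggregated either in the database or in an external environment, as the abstract describes.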

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3725653
Citations
  • Scopus 0