Data physicalization

A data physicalization (or simply physicalization) is a physical artefact whose geometry or material properties encode data.[1] Its main goals are to engage people and to communicate data using computer-supported physical data representations.[2][3][4]

History

Data physicalization predates computers and digital devices: ancient artifacts already served as media for representing abstract information. One example is the Blombos ochre plaque, estimated to be 70,000–80,000 years old.[5] The geometric and iconographic shapes engraved on its surface demonstrate the cognitive complexity of early humans. Moreover, since such representations were deliberately made and crafted, the evidence suggests that the geometric presentation of information was an established practice in those societies.[6] Although researchers still cannot decipher the specific information encoded in the artifact, several interpretations have been proposed; its potential functions have been divided into four categories: "numerical", "functional", "cognitive", and "social".[7] Later, around 35,000 BC, another artifact, the Lebombo bone, emerged, and its encoded information is easier to read: 29 distinct notches are carved on a baboon fibula, and the number of notches is thought to be closely related to the lunar cycle. This early counting system has also been regarded as the birth of calculation.[8]

Shortly before the invention of writing, a clay token system spread across ancient Mesopotamia. When buyers and sellers wanted to make a trade, they prepared a set of tokens and sealed them inside a clay envelope after impressing their shapes on its surface.[9] Such physical tokens were widely used in trade, administrative documents, and agricultural settlements.[10] The token system is also evidence of an early counting system: each shape corresponded to a physical referent, such as a "sheep", forming a one-to-one mapping. The significance of the tokens is that they used physical shape to encode numerical information,[11] and they are regarded as precursors of early writing systems,[12] since a two-dimensional symbol could record the same information as the impression created by a clay token.[9]

From 3000 BCE to the 17th century, a more complex physical encoding, the quipu, was developed and widely used in Andean South America.[13] Knotted strings unrelated to the quipu have also been used to record information by the ancient Chinese, Tibetans, and Japanese.[14][15][16][17] The Inca Empire used quipus for military and taxation purposes.[18] Their base-10 logical-numerical system records information through the relative distance of knots, the color of the knots, and the type of knots. Because quipus were made of cotton, very few of them survive. By analyzing the remaining artifacts, Erland Nordenskiöld[19] proposed that the quipu was the only writing system used by the Inca, and that its information-encoding technique was sophisticated and distinctive.[20]

The idea of data physicalization became popular in the 17th century, when architects and engineers widely used such methods in civil engineering and city management. For example, from 1663 to 1867, plan-relief models were used to visualize French territory and important military sites such as citadels and walled cities; one of their functions was thus to plan defense or offense. It is worth noting that these models can be categorized as a military technology and did not encode any abstract information.[21] The tradition of using tangible models to represent buildings and architecture remains today.

One of the contemporary examples of data physicalization is the Galton board, designed by Francis Galton, who promoted the concept of regression toward the mean. The Galton board, a useful tool for approximating the Gaussian law of errors, consists of evenly spaced pins and vertical slats at the bottom of the board. When a large number of marbles is released, they settle at the bottom, forming the contour of a bell curve: most marbles accumulate at the center (small deviation), with few at the edges of the board.[22]
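The board's behaviour can be sketched in a few lines of simulation: at each pin a marble deflects left or right with equal probability, so its final slot follows a binomial distribution that approximates a bell curve. (The row and marble counts below are illustrative, not taken from Galton's device.)

```python
import random

def galton(rows=12, marbles=10_000, seed=42):
    """Simulate marbles falling through a Galton board with `rows`
    rows of pins. At each pin a marble deflects left or right with
    equal probability, so its final slot index is the number of
    rightward deflections -- a Binomial(rows, 0.5) variable."""
    rng = random.Random(seed)
    slots = [0] * (rows + 1)  # one slot per possible final position
    for _ in range(marbles):
        slots[sum(rng.random() < 0.5 for _ in range(rows))] += 1
    return slots

counts = galton()  # most marbles pile up in the central slots
```

Plotting `counts` as a bar chart reproduces the bell-curve contour the physical marbles form.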

In 1935, three electricity companies, including the Pacific Gas and Electric Company and the Commonwealth Edison Company, created physical models of electricity data to visualize the power consumption of their customers so that they could better forecast upcoming power demand.[23] The model has one short axis and one long axis: the short axis represents a single day, whereas the long axis spans the whole year.[24] Viewers can see when customers consume the most electricity during the day and how consumption changes across seasons.[24] The model was built manually by cutting wooden sheets and stacking the pieces together.
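The model's layout can be sketched as a simple data transformation. Assuming one consumption reading per hour (an assumption for illustration, not documented for the 1935 model), a year of readings maps onto a 365 × 24 grid whose rows correspond to the wooden day-slices of the model:

```python
def to_model_grid(hourly_readings):
    """Arrange 365*24 chronological hourly readings into 365 rows
    (days along the long axis) of 24 values (hours along the short
    axis), mirroring the two axes of the wooden model."""
    assert len(hourly_readings) == 365 * 24
    return [hourly_readings[d * 24:(d + 1) * 24] for d in range(365)]
```

Each row of the grid would then be cut as one wooden profile and the rows stacked in calendar order.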

Researchers began to realize that data physicalization models not only help agents manage and plan certain tasks but can also greatly simplify complex problems by letting users manipulate data in the real world. From an epistemic perspective, physical manipulation enables users to uncover hidden patterns that cannot easily be detected otherwise.[25] Max Perutz received the Nobel Prize in Chemistry in 1962 for his distinguished work on the structure of globular proteins. When a narrow X-ray beam passes through a haemoglobin crystal, the diffraction pattern can reveal the inner arrangement of its atoms.[6] Part of Perutz's research involved creating a physical model of the haemoglobin molecule, which enabled him to manipulate and inspect the structure in a tangible way.

In Semiology of Graphics, Jacques Bertin designed a matrix visualization device called Domino, which lets users manipulate row and column data; the combination of rows and columns can be considered a two-dimensional data space. Bertin also defined which variables can be reordered and which cannot: time, for example, is a unidirectional variable and should be kept in its natural order.[26] Compared with the aforementioned works, this model emphasized the visual-thinking aspect of data physicalization and supports a variety of data types such as maps, matrices, and timelines. By reordering the data entries, an analyst can find patterns inside a dataset, and Domino can be reused on different datasets.[24]
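The reordering operation Domino supports physically can be sketched in code. The heuristic below (sorting rows and columns by their sums) is one simple seriation strategy chosen for illustration, not Bertin's own procedure; on a binary matrix it tends to cluster the 1s into blocks, exposing structure:

```python
def reorder(matrix):
    """Permute rows, then columns, of a binary matrix by their sums,
    a simple seriation heuristic: similar rows and columns end up
    adjacent, so blocks of 1s become visible."""
    rows = sorted(matrix, key=sum)  # lightest rows first
    col_order = sorted(range(len(rows[0])),
                       key=lambda j: sum(r[j] for r in rows))
    return [[r[j] for j in col_order] for r in rows]
```

An analyst using Domino performs the same permutations by hand, sliding physical rows and columns until a pattern emerges.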

More recent physicalization examples include using LEGO bricks to track project progress. For example, people have used LEGO to record their thesis-writing progress: a LEGO board lets users set concrete steps, such as data collection, data analysis, and development, before pushing toward publication.[27] Another application uses LEGO for bug tracking. For software engineers, keeping track of issues in a code base is a crucial task, and LEGO simplifies this process by physicalizing the issues.[28]

A specific application of data physicalization involves building tactile maps for visually impaired people. Past examples include using microcapsule paper to build tactile maps.[29] With the help of digital fabrication tools such as laser cutters, researchers in the Fab Lab at RWTH Aachen University have produced relief-based tactile maps to support visually impaired users. Some researchers have combined tangible user interfaces (TUIs) with tactile maps to enable dynamic rendering and enhance collaboration among visually impaired people (e.g. FluxMarkers).[30]

References