Treasure trove of hidden discoveries
An archive refers to “a place where people can go to gather firsthand facts, data, and evidence from letters, reports, notes, memos, photographs, and other primary sources,” according to the US National Archives and Records Administration, which manages a vast amount of official government data with some 3,000 employees.
The US archive agency says all people also have personal archives, or a collection of material that records important events from their family’s history.
Thanks to digital technologies, the proliferation of personal and official data in cyberspace is accelerating at a breakneck pace. The use of digital archives is also on the rise. But what digital archives really are is often misunderstood.
Researchers in the field of digital humanities say that a simple digitalization of data -- a digital extension of the existing offline archives -- is far from enough. The primary task behind the build-up of digital archives is to provide a venue where basic data and resources are readily available online so that both the public and professional researchers can use them with ease and understand their social, cultural and historical contexts.
While traditional education takes place in separate offline spaces such as universities, museums and historical sites, today’s students and researchers are moving to digital platforms that offer a rapidly growing store of archives of all sorts, a mix that appears to offer more chances to identify hidden relations among facts.
The Korean government, recognizing the implications of such digital archives, is helping establish a variety of online archives. For instance, the National Museum of Korean Contemporary History runs a digital archive that includes historically important photographs, documents and related information. It has 44 categories of data ranging from politics and culture to Japanese colonial rule and pro-democracy movements. The museum’s digital archive also provides what is called an “open application programming interface” that allows users and third-party companies to develop application programs and services based on its database.
The need for digitalization is not limited to tangible documents and assets. Intangible data such as traditional songs are likely to get lost over time without special care. The Intangible Heritage Digital Archive is an example of a digital archive specializing in the preservation of intangible cultural assets such as traditional performing arts, music albums and festivals.
Semantic web and its challenges
In constructing digital archives, the term “semantic web,” coined by Tim Berners-Lee, who is also the inventor of the World Wide Web, is often mentioned. The term refers to an envisioned system that enables machines to readily “understand” data and respond to complex human requests based on their meanings. To achieve this, the relevant information resources should be semantically structured. An intricate and logical ontology design -- a core structure that links information systematically for the semantic web -- is also needed to ensure database operability and better knowledge management.
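
As a rough illustration of what “semantically structured” can mean in practice, the sketch below stores facts as subject-predicate-object triples, the basic shape of semantic web data, and answers a question by following links rather than matching keywords. Every identifier and fact in it (`photo_001`, `memo_017`, `depicts` and so on) is a made-up assumption for illustration and is not drawn from any archive mentioned in this article.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All names and facts are hypothetical, invented for this sketch.
TRIPLES = [
    ("photo_001", "depicts", "pro_democracy_rally"),
    ("photo_001", "type", "Photograph"),
    ("memo_017", "mentions", "pro_democracy_rally"),
    ("memo_017", "type", "Document"),
    ("pro_democracy_rally", "type", "Event"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def resources_about(triples, event):
    """List every resource linked to the event by any predicate."""
    return sorted({s for (s, _p, _o) in match(triples, obj=event)})


related = resources_about(TRIPLES, "pro_democracy_rally")
print(related)  # ['memo_017', 'photo_001']
```

A photograph and a memo that share no keywords are still found together because both point to the same event, which is the kind of hidden relation a well-designed ontology is meant to surface.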
At the online forum, Kim Hyeon, a professor of cultural informatics at the Academy of Korean Studies, said during the discussion session that there are a host of challenges in building open and flexible archives. “If we use semantic-based search, we can get meaningful information in various ways,” said Kim. “However, the user interface is not friendly enough.”
Another problem, Kim said, is that even when the same word is used across a digital archive, people understand the term differently, so its actual meaning can diverge and create conflicts in the database.
Kim, who leads a project to help recreate the ancient capital of Korea in cyberspace, said that 32 researchers on the project team map out their own ontology schemes and databases.
“The aim is eventually to interconnect all of the different databases, but a problem can arise when a single concept refers to two different definitions,” Kim said. “This type of problem cannot be resolved by the use of artificial intelligence.”
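
The kind of clash Kim describes can at least be detected mechanically, even if deciding which definition should prevail remains a human judgment. The following is a minimal sketch under assumed data: the researcher names, terms and definitions are hypothetical, and the project's actual ontology schemes are not described in this article.

```python
from collections import defaultdict


def find_conflicts(vocabularies):
    """Return terms that carry more than one definition across vocabularies."""
    definitions = defaultdict(set)
    for vocab in vocabularies.values():
        for term, definition in vocab.items():
            definitions[term].add(definition)
    # Keep only terms whose definitions disagree.
    return {t: sorted(d) for t, d in definitions.items() if len(d) > 1}


# Hypothetical vocabularies from two researchers on the same project.
vocabularies = {
    "researcher_a": {
        "palace": "royal residence",
        "gate": "city wall entrance",
    },
    "researcher_b": {
        "palace": "government office complex",
        "gate": "city wall entrance",
    },
}

conflicts = find_conflicts(vocabularies)
print(conflicts)  # {'palace': ['government office complex', 'royal residence']}
```

The check only flags that “palace” means two things; it cannot say which meaning the merged database should adopt.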
By Yang Sung-jin (email@example.com)