Document in the context of Text categorization




⭐ Core Definition: Document

A document is a written, drawn, presented, or memorialized representation of thought, often the manifestation of non-fictional, as well as fictional, content. The word originates from the Latin documentum, which denotes a "teaching" or "lesson"; the verb doceō denotes "to teach". In the past, the word was usually used to denote written proof useful as evidence of a truth or fact. In the Computer Age, "document" usually denotes a primarily textual computer file, including its structure and format, e.g. fonts, colors, and images. Today, "document" is not defined by its transmission medium, e.g., paper, given the existence of electronic documents. "Documentation" is distinct because it has more denotations than "document". Documents are also distinguished from "realia", which are three-dimensional objects that would otherwise satisfy the definition of "document" because they memorialize or represent thought; documents are considered more as two-dimensional representations. Documents can be customized in many ways and can be shared. They can also represent creativity: history, events, examples, opinions, and stories can all be expressed in documents.


In this Dossier

Document in the context of Secondary sources

In scholarship, a secondary source is a document or recording that relates or discusses information originally presented elsewhere. A secondary source contrasts with a primary, or original, source of the information being discussed. A primary source can be a person with direct knowledge of a situation or it may be a document created by such a person.

A secondary source is one that gives information about a primary source. In a secondary source, the original information is selected, modified and arranged in a suitable format. Secondary sources involve generalization, analysis, interpretation, or evaluation of the original information.

View the full Wikipedia page for Secondary sources

Document in the context of Primary sources

In the study of history as an academic discipline, a primary source (also called an original source) is an artifact, document, diary, manuscript, autobiography, recording, or any other source of information that was created at the time under study. It serves as an original source of information about the topic. Similar definitions can be used in library science and other areas of scholarship, although different fields have somewhat different definitions. In journalism, a primary source can be a person with direct knowledge of a situation, or a document written by such a person.

Primary sources are distinguished from secondary sources, which cite, comment on, or build upon primary sources. Generally, accounts written after the fact with the benefit of hindsight are secondary. A secondary source may also be a primary source depending on how it is used. For example, a memoir would be considered a primary source in research concerning its author or about their friends characterized within it, but the same memoir would be a secondary source if it were used to examine the culture in which its author lived. "Primary" and "secondary" should be understood as relative terms, with sources categorized according to specific historical contexts and what is being studied.

View the full Wikipedia page for Primary sources

Document in the context of Legal instrument

Legal instrument is a legal term of art that is used for any formally executed written document that can be formally attributed to its author, records and formally expresses a legally enforceable act, process, or contractual duty, obligation, or right, and therefore evidences that act, process, or agreement. Examples include a certificate, deed, bond, contract, will, legislative act, notarial act, court writ or process, or any law passed by a competent legislative body in domestic or international law. Many legal instruments were written under seal by affixing a wax or paper seal to the document in evidence of its legal execution and authenticity (which often removed the need for consideration in contract law). However, today, many jurisdictions have abolished the requirement for documents to be under seal in order for them to have legal effect.

View the full Wikipedia page for Legal instrument

Document in the context of Non-fiction

Non-fiction (or nonfiction) is any document or media content that attempts, in good faith, to convey information only about the real world, rather than being grounded in imagination. Non-fiction typically aims to present topics objectively based on historical, scientific, and empirical information. However, some non-fiction ranges into more subjective territory, including sincerely held opinions on real-world topics.

Often referring specifically to prose writing, non-fiction is one of the two fundamental approaches to story and storytelling, in contrast to narrative fiction, which is largely populated by imaginary characters and events. Non-fiction writers can show the reasons for and consequences of events; they can compare, contrast, classify, categorise and summarise information, put the facts in a logical or chronological order, and infer and reach conclusions about facts. To help readers find information, they can use graphic, structural and typographic features such as pictures, graphs or charts, diagrams, flowcharts, summaries, glossaries, sidebars, timelines, tables of contents, headings, subheadings, bolded or italicised words, footnotes, maps, indices, labels, and captions.

View the full Wikipedia page for Non-fiction

Document in the context of Inscription

Epigraphy (from Ancient Greek ἐπιγραφή (epigraphḗ) 'inscription') is the study of inscriptions, or epigraphs, as writing; it is the science of identifying graphemes, clarifying their meanings, classifying their uses according to dates and cultural contexts, and drawing conclusions about the writing and the writers. Specifically excluded from epigraphy are the historical significance of an epigraph as a document and the artistic value of a literary composition. A person using the methods of epigraphy is called an epigrapher or epigraphist. For example, the Behistun inscription is an official document of the Achaemenid Empire engraved on native rock at a location in Iran. Epigraphists are responsible for reconstructing, translating, and dating the trilingual inscription and finding any relevant circumstances. It is the work of historians, however, to determine and interpret the events recorded by the inscription as document. Often, epigraphy and history are competences practised by the same person. Epigraphy is a primary tool of archaeology when dealing with literate cultures. The US Library of Congress classifies epigraphy as one of the auxiliary sciences of history. Epigraphy also helps identify a forgery: epigraphic evidence formed part of the discussion concerning the James Ossuary.

An epigraph (not to be confused with epigram) is any sort of text, from a single grapheme (such as marks on a pot that abbreviate the name of the merchant who shipped commodities in the pot) to a lengthy document (such as a treatise, a work of literature, or a hagiographic inscription). Epigraphy overlaps other competences such as numismatics or palaeography. When compared to books, most inscriptions are short. The media and the forms of the graphemes are diverse: engravings in stone or metal, scratches on rock, impressions in wax, embossing on cast metal, cameo or intaglio on precious stones, painting on ceramic or in fresco. Typically the material is durable, but the durability might be an accident of circumstance, such as the baking of a clay tablet in a conflagration.

View the full Wikipedia page for Inscription

Document in the context of Library

A library is a collection of books, and possibly other materials and media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or digital (soft copies) materials, and may be a physical location, a virtual space, or both. A library's collection normally includes printed materials which can be borrowed, and usually also includes a reference section of publications which may only be utilized inside the premises. Resources such as commercial releases of films, television programmes, other video recordings, radio, music and audio recordings may be available in many formats. These include DVDs, Blu-rays, CDs, cassettes, or other applicable formats such as microform. They may also provide access to information, music or other content held on bibliographic databases. In addition, some libraries offer creation stations for makers which offer access to a 3D printing station with a 3D scanner.

Libraries can vary widely in size and may be organised and maintained by a public body such as a government, an institution (such as a school or museum), a corporation, or a private individual. In addition to providing materials, libraries also provide the services of librarians who are trained experts in finding, selecting, circulating and organising information while interpreting information needs and navigating and analysing large amounts of information with a variety of resources. The area of study is known as library and information science or studies.

View the full Wikipedia page for Library

Document in the context of Digitizing

Digitization is the process of converting information into a digital (i.e. computer-readable) format. The result is the representation of an object, image, sound, document, or signal (usually an analog signal) obtained by generating a series of numbers that describe a discrete set of points or samples. The result is called a digital representation or, more specifically, a digital image for an object and a digital form for a signal. In modern practice, the digitized data is in the form of binary numbers, which facilitates processing by digital computers and other operations, but digitizing simply means "the conversion of analog source material into a numerical format"; the decimal or any other number system can be used instead.
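
As a minimal sketch of the idea of "generating a series of numbers that describe a discrete set of points or samples", the following Python example samples a continuous signal at fixed intervals and quantizes each sample to an integer code. The function names (`digitize`, `analog_tone`), the 8 Hz sample rate, and the 16 quantization levels are assumptions chosen for illustration; they do not come from the text above. The same codes are then written in both binary and decimal to show that the number system is a matter of presentation.

```python
import math


def digitize(signal, duration_s, sample_rate_hz, levels):
    """Sample a continuous signal with values in [-1, 1] and quantize
    each sample to an integer code in the range 0..levels-1."""
    n_samples = int(duration_s * sample_rate_hz)
    codes = []
    for i in range(n_samples):
        t = i / sample_rate_hz                      # discrete sampling instant
        value = max(-1.0, min(1.0, signal(t)))      # clamp to the expected range
        code = round((value + 1.0) / 2.0 * (levels - 1))  # map to an integer level
        codes.append(code)
    return codes


def analog_tone(t):
    """Hypothetical analog source: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * 1.0 * t)


# Illustrative parameters only: 1 second at 8 samples/second, 16 levels (4 bits).
codes = digitize(analog_tone, duration_s=1.0, sample_rate_hz=8, levels=16)

# The same digitized data written in two number systems.
binary = [format(c, "04b") for c in codes]
decimal = [str(c) for c in codes]

print(binary)
print(decimal)
```

A higher sample rate or more levels would describe the original signal more closely, at the cost of producing more numbers to store.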

Digitization is of crucial importance to data processing, storage, and transmission, because it "allows information of all kinds in all formats to be carried with the same efficiency and also intermingled." Though analog data is typically more stable, digital data has the potential to be more easily shared and accessed and, in theory, can be propagated indefinitely without generation loss, provided it is migrated to new, stable formats as needed. This potential has led to institutional digitization projects designed to improve access and the rapid growth of the digital preservation field.

View the full Wikipedia page for Digitizing

Document in the context of Textual scholarship

Textual scholarship (or textual studies) is an umbrella term for disciplines that deal with describing, transcribing, editing, or annotating texts and physical documents. It examines how texts are produced, transmitted, and transformed over time, considering both their material and linguistic forms. From the recovery and collation of manuscript witnesses to the creation of critical or digital editions that reflect textual variation and editorial decision-making, textual scholarship encompasses a wide range of practices. The field integrates traditional philological methods with contemporary digital approaches, bridging historical textual criticism and modern data-driven humanities research.

View the full Wikipedia page for Textual scholarship

Document in the context of Reference works

A reference work is a document, such as a paper, book or periodical (or their electronic equivalents), to which one can refer for information. The information is intended to be found quickly when needed. Such works are usually referred to for particular pieces of information, rather than read beginning to end. The writing style used in these works is informative; the authors avoid opinions and the use of the first person, and emphasize facts.

Indices are a common navigation feature in many types of reference works. Many reference works are put together by a team of contributors whose work is coordinated by one or more editors, rather than by an individual author. Updated editions are usually published as needed, in some cases annually, such as Whitaker's Almanack, and Who's Who.

View the full Wikipedia page for Reference works

Document in the context of Part (music)

A part in music refers to a component of a musical composition. Because there are multiple ways to separate these components, there are several contradictory senses in which the word "part" is used:

  • any individual melody (or voice), whether vocal or instrumental, that can be abstracted as continuous and independent from other notes being performed simultaneously in polyphony. Within the music played by a single pianist, one can often identify outer parts (the top and bottom parts) or an inner part (those in between). On the other hand, within a choir, "outer parts" and "inner parts" would refer to music performed by different singers. (See § Polyphony and part-writing)
  • the musical instructions for any individual instrument or voice (often given as a handwritten, printed, or digitized document) of sheet music (as opposed to the full score which shows all parts of the ensemble in the same document). A musician's part usually does not contain instructions for the other players in the ensemble, only instructions for that individual.
  • the music played by any group of musicians who all perform together for a given piece; in a symphony orchestra, a dozen or more cello players may all play "the same part" even if they each have their own physical copy of the music. This part may be in unison or may be harmonized, and may even sometimes contain counter-melodies within it. A percussion part may sometimes only contain rhythm. This sense of "part" does not require a written copy of the music; a bass player in a rock band "plays the bass part" even if there is no written version of the song.
  • a section in the large-scale form of a piece. (See § Musical form)

View the full Wikipedia page for Part (music)

Document in the context of Cataloging

In library and information science, cataloging (US) or cataloguing (UK) is the process of creating metadata representing information resources, such as books, sound recordings, and moving images. Cataloging provides information such as authors' names, titles, and subject terms that describe resources, typically through the creation of bibliographic records. The records serve as surrogates for the stored information resources. Since the 1970s these metadata have been in machine-readable form and are indexed by information retrieval tools, such as bibliographic databases or search engines. While the cataloging process typically results in the production of library catalogs, it also produces other types of discovery tools for documents and collections.
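
The relationship between bibliographic records and retrieval tools can be sketched in Python. This is an illustrative toy, not a real cataloging standard: the `BibliographicRecord` class, its fields, the `build_index` and `search` helpers, and the two sample records are all hypothetical. Each record acts as a surrogate for a resource, and a simple inverted index maps words from the metadata back to record identifiers.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BibliographicRecord:
    """Hypothetical machine-readable surrogate for one information resource."""
    record_id: str
    title: str
    authors: list
    subjects: list = field(default_factory=list)


def build_index(records):
    """Build an inverted index: lowercase word -> set of record ids."""
    index = defaultdict(set)
    for rec in records:
        terms = rec.title.split() + rec.authors + rec.subjects
        for term in terms:
            for word in term.lower().split():
                index[word.strip(".,;:")].add(rec.record_id)
    return index


def search(index, query):
    """Return ids of records whose metadata contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Invented sample records, for illustration only.
records = [
    BibliographicRecord("r1", "Introduction to Cataloging",
                        ["Example Author"], ["library science"]),
    BibliographicRecord("r2", "Sound Recordings in Libraries",
                        ["Sample Writer"], ["library science", "audio"]),
]
index = build_index(records)

print(search(index, "library science"))
print(search(index, "cataloging"))
```

The search never touches the resources themselves; it operates only on the metadata, which is what makes the records usable as surrogates.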

Bibliographic control provides the philosophical basis of cataloging, defining the rules that sufficiently describe information resources, and enable users to find and select the most appropriate resource. A cataloger is an individual responsible for the processes of description, subject analysis, classification, and authority control of library materials. Catalogers serve as the "foundation of all library service, as they are the ones who organize information in such a way as to make it easily accessible".

View the full Wikipedia page for Cataloging

Document in the context of Storage medium

Data storage is the recording (storing) of information (data) in a storage medium. Handwriting, phonographic recording, magnetic tape, and optical discs are all examples of storage media. Biological molecules such as RNA and DNA are considered by some as data storage. Recording may be accomplished with virtually any form of energy. Electronic data storage requires electrical power to store and retrieve data.

Data storage in a digital, machine-readable medium is sometimes called digital data. Computer data storage is one of the core functions of a general-purpose computer. Electronic documents can be stored in much less space than paper documents. Barcodes and magnetic ink character recognition (MICR) are two ways of recording machine-readable data on paper.

View the full Wikipedia page for Storage medium

Document in the context of Text classification

Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science. The problems are overlapping, however, and there is therefore interdisciplinary research on document classification.

The documents to be classified may be texts, images, music, etc. Each kind of document possesses its special classification problems. When not otherwise specified, text classification is implied.
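
To make the algorithmic side concrete, here is a minimal Python sketch that assigns a text document to a single category. The text above does not name a particular algorithm; a multinomial naive Bayes classifier with add-one smoothing is used here only as one common, simple choice. The training sentences, the category names, and the function names (`tokenize`, `train`, `classify`) are invented for illustration, and the sketch handles only the single-label case rather than assignment to several classes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def train(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)   # class prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue                                      # ignore unseen words
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training set with two hypothetical categories.
training = [
    ("the court issued a writ and a contract ruling", "legal"),
    ("the deed and the will were signed under seal", "legal"),
    ("the orchestra played the cello part in unison", "music"),
    ("the choir sang the melody and the bass part", "music"),
]
model = train(training)

print(classify("a contract signed under seal", model))
print(classify("the cello melody", model))
```

Manual classification would replace `classify` with a human judgment against a classification scheme; the input and output of the task stay the same.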

View the full Wikipedia page for Text classification

Document in the context of Content analysis

Content analysis is the study of documents and communication artifacts, including texts, photos, speeches, or essays. Social scientists use content analysis to examine patterns in communication in a replicable and systematic manner. One of the key advantages of using content analysis to analyse social phenomena is its non-invasive nature, in contrast to simulating social experiences or collecting survey answers.

Practices and philosophies of content analysis vary between academic disciplines. They all involve systematic reading or observation of texts or artifacts which are assigned labels (sometimes called codes) to indicate the presence of interesting, meaningful pieces of content. By systematically labeling the content of a set of texts, researchers can analyse patterns of content quantitatively using statistical methods, or use qualitative methods to analyse meanings of content within texts.
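
The quantitative step described above, labeling texts with codes and then counting them, can be sketched in Python. In practice, codes are often assigned by trained human coders; the keyword-matching rule below is only a stand-in for that judgment. The codebook, its keywords, the sample sentences, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical codebook: each code is triggered by any of its keywords.
CODEBOOK = {
    "economy": {"jobs", "wages", "prices"},
    "health": {"hospital", "vaccine", "doctor"},
}


def code_text(text, codebook):
    """Return the sorted list of codes whose keywords appear in the text."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return sorted(label for label, keywords in codebook.items()
                  if words & keywords)


def code_frequencies(texts, codebook):
    """Count how many texts received each code."""
    counts = Counter()
    for text in texts:
        counts.update(code_text(text, codebook))
    return counts


# Invented sample texts, for illustration only.
texts = [
    "Wages are rising but prices rise faster.",
    "The new hospital hired another doctor.",
    "Vaccine costs push prices up.",
]
freq = code_frequencies(texts, CODEBOOK)

for text in texts:
    print(code_text(text, CODEBOOK), "<-", text)
print(dict(freq))
```

Once every text carries its codes, the resulting counts can be passed to whatever statistical method the study calls for.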

View the full Wikipedia page for Content analysis