  • 1. Hagedorn, Joshua
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Bearing a Bag-of-Tales: An Open Corpus of Annotated Folktales for Reproducible Research. 2022. In: Journal of Open Humanities Data, E-ISSN 2059-481X, Vol. 8, no. 16. Journal article (Refereed)
    Abstract [en]

    Motifs in folktales and myths have been identified and articulated by scholars, and the computational identification and discovery of such motifs is an area of ongoing research. Achieving this goal means meeting scientific requirements (that methods be comparable and replicable) and requirements for collaboration (that multi-disciplinary teams can reliably access data). To support those requirements, access to consistent reference datasets is needed. Unfortunately, these datasets are not openly available in a format that supports their use in data science. Here we report work in progress toward this goal, having converted the Ashliman Folktexts collection into a public dataset of annotated tale texts. The data can be accessed at doi.org/10.5281/zenodo.6575263.

  • 2.
    Pastor-Sánchez, Juan-Antonio
    University of Murcia, Murcia, Spain.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Kontopoulos, Efstratios
    Catalink Limited, Nicosia, Cyprus.
    Expressing Significant Others by Gravitation in the Ontology of Greek Mythology. 2022. In: Metadata and Semantic Research: 15th International Conference, MTSR 2021, Virtual Event, November 29 – December 3, 2021, Revised Selected Papers / [ed] Emmanouel Garoufallou, María-Antonia Ovalle-Perandones, Andreas Vlachidis, 2022, pp. 224-235. Conference paper (Refereed)
    Abstract [en]

    To help close the gap between folksonomic knowledge vs. digital classical philology, based on a perceived analogy between Newtonian mechanics and evolving semantic spaces, we tested a new conceptual framework in a specific domain, the Ontology of Greek Mythology (OGM). The underlying Wikidata-based public dataset has 5377 entities with 289 types of relations, out of which 34 were used for its construction. To visualize the influence structure of a subset of 771 divine actors by other means than the force-directed placement of graph nodes, we expressed the combination of semantic relatedness plus objective vs. relative importance of these entities by their gravitational behaviour. To that end, the metaphoric equivalents of distance, mass, force, gravitational potential, and gravitational potential energy were applied, with the latter interpreted as the structuration capacity of nodes. The results were meaningful to the trained eye, but, given the very high number of contour maps and heatmaps available by our public tool, their systematic evaluation lies ahead.

  • 3.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Olson, Nasrine
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Lindell, Eva
    Högskolan i Borås, Akademin för textil, teknik och ekonomi.
    Persson, Nils-Krister
    Högskolan i Borås, Akademin för textil, teknik och ekonomi.
    Riga, Marina
    Kontopoulos, Efstratios
    Kompatsiaris, Ioannis
    Communicating Semantic Content to Persons with Deafblindness by Haptograms and Smart Textiles: Theoretical Approach and Methodology. 2020. In: International Journal on Advances in Intelligent Systems, ISSN 1942-2679, E-ISSN 1942-2679, Vol. 13, no. 1&2, pp. 103-113. Journal article (Refereed)
    Abstract [en]

    By means of a proof-of-concept prototype, which is work in progress, we adopted a multidisciplinary approach to develop a smart-textile-based communication system for use by people with deafblindness. In this system, sensor technologies and computer vision are used to detect environmental cues such as presence of obstacles, faces, objects, etc. Focusing on the communication module here, a new ontology connects visual analytics with the user to label detected semantic content about objects, persons and situations for navigation and situational awareness. Such labelled content is then translated to a haptogram vocabulary with static vs. dynamic patterns, which are mapped to the body. A haptogram denotes a tactile symbol composed over a touchscreen, its dynamic nature referring to the act of writing or drawing. A vest made of smart textile, in the current variant equipped with a 4 x 4 grid of vibrotactile actuators, is used to transmit haptograms on the user’s back. Thereby system messages of different complexity -- both alerts and short sentences -- can be received by the user, who then has the option to respond by pre-coded questions and messages. By means of grids with more actuators, displays with higher resolution can be implemented and tested, paving the way for an extended haptogram vocabulary, covering more detailed ontology content.

  • 4.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Olson, Nasrine
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Riga, Marina
    Kontopoulos, Efstratios
    Kompatsiaris, Ioannis
    Static and Dynamic Haptograms to Communicate Semantic Content: Towards Enabling Face-to-Face Communication for People with Deafblindness. 2019. In: ThinkMind// SEMAPRO, International Conference on Advances in Semantic Processing / [ed] Tim vor der Brück, Efstratios Kontopoulos, Porto, Portugal: International Academy, Research and Industry Association (IARIA), 2019. Conference paper (Refereed)
    Abstract [en]

    Based on the ontology developed in the ongoing SUITCEYES EU-funded project to bridge visual analytics for situational awareness and navigation with semantic labelling of environmental cues, we designed a set of static and dynamic haptograms to represent concepts for two-way communication between deafblind and non-deafblind users. A haptogram corresponds to a tactile symbol drawn over a touchscreen, its dynamic nature referring to the act of writing or drawing, where the touchscreen can take several forms, including a smart textile screen designated for specific areas on the body. In its current version, our haptogram set is generated over a 4 x 4 matrix of cells and is displayed on the back of the user, tested for robustness at the receiving end. The concepts and concept sequences simulating simple questions and answers represented by haptograms are focused on ontology content for now but can be scaled up.

  • 5.
    Kiraly, Laszlo
    Cardiac Sciences, Sheikh Khalifa Medical City, Abu Dhabi.
    Kiraly, Balint
    Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary.
    Szigeti, Krisztian
    Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary.
    Tamas, Csaba Zsolt
    Gottsegen Hungarian Institute of Cardiology, Budapest, Hungary.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Virtual museum of congenital heart defects: digitization and establishment of a database for cardiac specimens. 2019. In: Quantitative imaging in medicine and surgery, ISSN 2223-4292, Vol. 9, no. 1, pp. 115-126. Journal article (Refereed)
    Abstract [en]

    Education and training of morphology for medical students, and professionals specializing in pediatric cardiology and surgery has traditionally been based on hands-on encounter with congenitally malformed cardiac specimens. Large international archives are no longer widely available due to stricter data protection rules, a reduced number of autopsies, attrition rate of existing specimens, and most importantly due to a higher survival rate of patients. Our Cardiac Archive houses about 400 cardiac specimens with congenital heart disease. The collection spans almost 60 years and thus goes back to pre-surgical era. Unfortunately, attrition rate due to desiccation has led to an increased natural decay in recent years. The present multi-institutional project focuses on saving the collection by digitization. Specimens are scanned by high-resolution micro-CT/MRI. Virtual 3D-models are segmented and a comprehensive database is built. We now report an initial feasibility study with six test specimens that provided promising results, however, adequate presentation of the intracardiac anatomy, including septa and cardiac valves requires further refinements. Computer assisted design methods are necessary to overcome consequences of pathological examination, shrinkage and/or distortion of the specimens. For a next step, we anticipate an expandable web-based virtual museum with interactive reference and training tools. Web access for professional third parties will be provided by registration/subscription. In a future phase, segmental wall motion data could be added to virtual models. 3D-printed models may replace actual specimens and serve as hands-on surgical training to elucidate complex morphologies, promote surgical emulation, and extract more accurate procedural knowledge based on such a collection.

  • 6.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Konstantinidis, Konstantinos
    Papadopoulos, Symeon
    Kontopoulos, Efstratios
    A Physical Metaphor to Study Semantic Drift. 2016. In: Proceedings of SuCCESS-16, 1st International Workshop on Semantic Change & Evolving Semantics, 2016, Vol. 1695. Conference paper (Refereed)
    Abstract [en]

    In accessibility tests for digital preservation, over time we experience drifts of localized and labelled content in statistical models of evolving semantics represented as a vector field. This articulates the need to detect, measure, interpret and model outcomes of knowledge dynamics. To this end we employ a high-performance machine learning algorithm for the training of extremely large emergent self-organizing maps for exploratory data analysis. The working hypothesis we present here is that the dynamics of semantic drifts can be modeled on a relaxed version of Newtonian mechanics called social mechanics. By using term distances as a measure of semantic relatedness vs. their PageRank values indicating social importance and applied as variable ‘term mass’, gravitation as a metaphor to express changes in the semantic content of a vector field lends a new perspective for experimentation. From ‘term gravitation’ over time, one can compute its generating potential whose fluctuations manifest modifications in pairwise term similarity vs. social importance, thereby updating Osgood’s semantic differential. The dataset examined is the public catalog metadata of Tate Galleries, London.

  • 7.
    Kontopoulos, E.
    CERTH, Thessaloniki, Greece.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Konstantinidis, K.
    CERTH, Thessaloniki, Greece.
    Riga, M.
    CERTH, Thessaloniki, Greece.
    Mitzias, P.
    CERTH, Thessaloniki, Greece.
    Stavropoulos, T.
    CERTH, Thessaloniki, Greece.
    Andreadis, S.
    CERTH, Thessaloniki, Greece.
    Maronidis, A.
    CERTH, Thessaloniki, Greece.
    Karakostas, A.
    CERTH, Thessaloniki, Greece.
    Tachos, S.
    CERTH, Thessaloniki, Greece.
    Kaltsa, V.
    CERTH, Thessaloniki, Greece.
    Tsagiopoulu, M.
    CERTH, Thessaloniki, Greece.
    Avgerinakis, K.
    CERTH, Thessaloniki, Greece.
    Deliverable 4.5: Context-aware Content Interpretation. 2016. Report (Refereed)
    Abstract [en]

    The current deliverable summarises the work conducted within task T4.5 of WP4, presenting our proposed approaches for contextualised content interpretation, aimed at gaining insightful contextualised views on content semantics. This is achieved through the adoption of appropriate context-aware semantic models developed within the project, and via enriching the semantic descriptions with background knowledge, deriving thus higher level contextualised content interpretations that are closer to human perception and appraisal needs. More specifically, the main contributions of the deliverable are the following:

      • A theoretical framework using physics as a metaphor to develop different models of evolving semantic content.
      • A set of proof-of-concept models for semantic drifts due to field dynamics, introducing two methods to identify quantum-like (QL) patterns in evolving information searching behaviour, and a QL model akin to particle-wave duality for semantic content classification.
      • Integration of two specific tools, Somoclu for drift detection and Ncpol2sdpa for entanglement detection.
      • An “energetic” hypothesis accounting for contextualized evolving semantic structures over time.
      • A proposed semantic interpretation framework, integrating (a) an ontological inference scheme based on Description Logics (DL), (b) a rule-based reasoning layer built on SPARQL Inference Notation (SPIN), (c) an uncertainty management framework based on non-monotonic logics.
      • A novel scheme for contextualized reasoning on semantic drift, based on LRM dependencies and OWL’s punning mechanism.
      • An implementation of SPIN rules for policy and ecosystem change management, with the adoption of LRM preconditions and impacts.
      • Specific use case scenarios demonstrate the context under development and the efficiency of the approach.
      • Respective open-source implementations and experimental results that validate all the above.

    All these contributions are tightly interlinked with the other PERICLES work packages: WP2 supplies the use cases and sample datasets for validating our proposed approaches, WP3 provides the models (LRM and Digital Ecosystem models) that form the basis for our semantic representations of content and context, WP5 provides the practical application of the technologies developed to preservation processes, while the tools and algorithms presented in this deliverable can be deployed in combination with test scenarios, which will be part of the WP6 test beds.

  • 8.
    Waddington, Simon
    King's College London, UK.
    Hedges, Mark
    King's College London, UK.
    Riga, Marina
    CERTH, Thessaloniki, Greece.
    Mitzias, Panagiotis
    CERTH, Thessaloniki, Greece.
    Kontopoulos, Efstratios
    CERTH, Thessaloniki, Greece.
    Kompatsiaris, Ioannis
    CERTH, Thessaloniki, Greece.
    Vion-Dury, Jean-Yves
    XRCE, Grenoble, France.
    Lagos, Nikolaos
    XRCE, Grenoble, France.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Corubolo, Fabio
    University of Liverpool, UK.
    Muller, Christian
    BUSOC, Belgium.
    McNeill, John
    Tate Galleries, London, UK.
    PERICLES – Digital Preservation through Management of Change in Evolving Ecosystems. 2016. In: The Success of European Projects Using New Information and Communication Technologies / [ed] Hamriouni, S., Setubal, Portugal, 2016, pp. 51-74. Conference paper (Refereed)
    Abstract [en]

    Management of change is essential to ensure the long-term reusability of digital assets. Change can be brought about in many ways, including through technological, user community and policy factors. Motivated by case studies in space science and time-based media, we consider the impact of change on complex digital objects comprising multiple interdependent entities, such as files, software and documentation. Our approach is based on modelling of digital ecosystems, in which abstract representations are used to assess risks to sustainability and support tasks such as appraisal. The paper is based on work of the EU FP7 PERICLES project on digital preservation, and presents some general concepts as well as a description of selected research areas under investigation by the project.

  • 9.
    Maronidis, A.
    CERTH, Thessaloniki, Greece.
    Chatzilari, E.
    CERTH, Thessaloniki, Greece.
    Kontopoulos, E.
    CERTH, Thessaloniki, Greece.
    Nikopoulos, S.
    CERTH, Thessaloniki, Greece.
    Riga, M.
    CERTH, Thessaloniki, Greece.
    Mitzias, P.
    CERTH, Thessaloniki, Greece.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Gill, A.
    King's College London, UK.
    Tonkin, E.L.
    King's College London, UK.
    De Weerdt, D.
    SpaceApps, Belgium.
    Corubolo, F.
    University of Liverpool, UK.
    Waddington, S.
    King's College London, UK.
    Sauter, Ch.
    King's College London, UK.
    PERICLES Deliverable 4.3: Content Semantics and Use Context Analysis Techniques. 2016. Report (Refereed)
    Abstract [en]

    The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and the subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied by existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of the digital objects, as well as their ability to be accurately interpreted as initially intended.

  • 10.
    Kontopoulos, Efstratios
    CERTH, Thessaloniki, Greece.
    Riga, Marina
    CERTH, Thessaloniki, Greece.
    Mitzias, P.
    CERTH, Thessaloniki, Greece.
    Andreadis, S.
    CERTH, Thessaloniki, Greece.
    Stavropoulos, T.
    CERTH, Thessaloniki, Greece.
    Konstantinidis, K.
    CERTH, Thessaloniki, Greece.
    Maronidis, A.
    CERTH, Thessaloniki, Greece.
    Karakostas, A.
    CERTH, Thessaloniki, Greece.
    Tachos, S.
    CERTH, Thessaloniki, Greece.
    Kaltsa, V.
    CERTH, Thessaloniki, Greece.
    Tsagiopoulu, M.
    CERTH, Thessaloniki, Greece.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Gill, A.
    King's College London, UK.
    Tonkin, E. L.
    King's College London, UK.
    Waddington, S.
    King's College London, UK.
    Sauter, Ch.
    King's College London, UK.
    Corubolo, F.
    University of Liverpool, UK.
    PERICLES Deliverable 4.4: Modelling Contextualised Semantics. 2016. Report (Refereed)
    Abstract [en]

    The current deliverable summarises the work conducted within task T4.4 of WP4, presenting our proposed models for semantically representing digital content and its respective context – the latter refers to any information coming from the environment of the digital object (DO) that offers a better insight into the object’s status, its interrelationships with other content items and information about the object’s context of use. Within PERICLES, we refer to the content semantics enriched with the contextual perspective as “contextualised semantics”. The deliverable presents two complementary modelling approaches, based respectively on (a) ontologies and logics, and, (b) multivariate statistics.

  • 11.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Liu, Ying-Hsang
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Gedeon, Tom
    Lim, Ik Soo
    Risk and Ambiguity in Information Seeking: Eye Gaze Patterns Reveal Contextual Behaviour in Dealing with Uncertainty. 2016. In: Frontiers in Psychology, E-ISSN 1664-1078, Vol. 7. Journal article (Refereed)
    Abstract [en]

    Information foraging connects optimal foraging theory in ecology with how humans search for information. The theory suggests that, following an information scent, the information seeker must optimize the tradeoff between exploration by repeated steps in the search space vs. exploitation, using the resources encountered. We conjecture that this tradeoff characterizes how a user deals with uncertainty and its two aspects, risk and ambiguity in economic theory. Risk is related to the perceived quality of the actually visited patch of information, and can be reduced by exploiting and understanding the patch to a better extent. Ambiguity, on the other hand, is the opportunity cost of having higher quality patches elsewhere in the search space. The aforementioned tradeoff depends on many attributes, including traits of the user: at the two extreme ends of the spectrum, analytic and wholistic searchers employ entirely different strategies. The former type focuses on exploitation first, interspersed with bouts of exploration, whereas the latter type prefers to explore the search space first and consume later. Based on an eye-tracking study of experts’ interactions with novel search interfaces in the biomedical domain, we demonstrate that perceived risk shifts the balance between exploration and exploitation in either type of users, tilting it against vs. in favour of ambiguity minimization. Since the pattern of behaviour in information foraging is quintessentially sequential, risk and ambiguity minimization cannot happen simultaneously, leading to a fundamental limit on how good such a tradeoff can be. This in turn connects information seeking with the emergent field of quantum decision theory.

  • 12.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Nelhans, Gustaf
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Ruling out static latent homophily in citation networks. 2016. In: Scientometrics, ISSN 0138-9130, E-ISSN 1588-2861, Vol. 110, no. 2, pp. 765-777. Journal article (Refereed)
    Abstract [en]

    Citation and coauthor networks offer an insight into the dynamics of scientific progress. We can also view them as representations of a causal structure, a logical process captured in a graph. From a causal perspective, we can ask questions such as whether authors form groups primarily due to their prior shared interest, or if their favourite topics are ‘contagious’ and spread through co-authorship. Such networks have been widely studied by the artificial intelligence community, and recently a connection has been made to nonlocal correlations produced by entangled particles in quantum physics—the impact of latent hidden variables can be analyzed by the same algebraic geometric methodology that relies on a sequence of semidefinite programming (SDP) relaxations. Following this trail, we treat our sample coauthor network as a causal graph and, using SDP relaxations, rule out latent homophily as a manifestation of prior shared interest only, leading to the observed patternedness. By introducing algebraic geometry to citation studies, we add a new tool to existing methods for the analysis of content-related social influences.

  • 13. Kontopoulos, Efstratios
    Moysiadis, Theodoros
    Tsagiopoulou, Maria
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Papakonstantinou, Nikos
    Ntoufa, Stavroula
    Meditskos, Georgios
    Stamatopoulos, Kostas
    Kompatsiaris, Ioannis
    Studying the Cohesion Evolution of Genes Related to Chronic Lymphocytic Leukemia Using Semantic Similarity in Gene Ontology and Self-Organizing Maps. 2016. In: Proceedings of SWAT4LS-16, 9th International Conference on Semantic Web Applications and Tools for Life Sciences, 2016. Conference paper (Refereed)
    Abstract [en]

    A significant body of work on biomedical text mining is aimed at uncovering meaningful associations between biological entities, including genes. This has the potential to offer new insights for research, uncovering hidden links between genes involved in critical pathways and processes. Recently, high-throughput studies have started to unravel the genetic landscape of chronic lymphocytic leukemia (CLL), the most common adult leukemia. CLL displays remarkable clinical heterogeneity, likely reflecting its underlying biological heterogeneity which, despite all progress, still remains insufficiently characterized and understood. This paper deploys an ontology-based semantic similarity combined with self-organizing maps for studying the temporal evolution of cohesion among CLL-related genes and the extracted information. Three consecutive time periods are considered and groups of genes are derived therein. Our preliminary results indicated that our proposed gene groupings are meaningful and that the temporal dimension indeed impacted the gene cohesion, leaving a lot of room for further promising investigations.

  • 14. Meroño Peñuela, Albert
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Visualizing the Drift of Linked Open Data Using Self-Organizing Maps. 2016. In: Proceedings of Drift-a-LOD Workshop at the 20th International Conference on Knowledge Engineering and Knowledge Management, 2016. Conference paper (Refereed)
    Abstract [en]

    The urge for evolving the Web into a globally shared dataspace has turned the Linked Open Data (LOD) cloud into a massive platform containing 100 billion machine-readable statements. Several factors hamper a historical study of the evolution of the LOD cloud, and hence forecasting its future: its ever-growing scale, which makes a global analysis difficult; its Web-distributed nature, which challenges the analysis of its data; and the scarcity of regular and time-stamped archival dumps. Recently, a scalable implementation of self-organizing maps (SOM) has been developed to visualize the local topology of high-dimensional data. We use this methodology to address scalability issues, and the Dynamic Linked Data Observatory, a regular biweekly, centralized sample of the LOD cloud, as a time-stamped collection. We visualize the drift of Linked Datasets between 2012 and 2016, finding that datasets with high availability, high vocabulary reuse, and modeling with commonly used terms in the LOD cloud are better traceable across time.

  • 15.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Konstantinidis, K
    CERTH.
    Papadopoulos, S
    CERTH.
    A Potential Surface Underlying Meaning? 2015. Conference paper (Other academic)
    Abstract [en]

    Machine learning algorithms utilizing gradient descent to identify concepts or more general learnables hint at a so-far ignored possibility, namely that local and global minima represent any vocabulary as a landscape against which evaluation of the results can take place. A simple example to illustrate this idea would be a potential surface underlying gravitation. However, to construct a gravitation-based representation of, e.g., word meaning, only the distance between localized items is a given in the vector space, whereas the equivalents of mass or charge are unknown in semantics. Clearly, the working hypothesis that physical fields could be a useful metaphor to study word and sentence meaning is an option but our current representations are incomplete in this respect. For a starter, consider that an RBF kernel has the capacity to generate a potential surface and hence create the impression of gravity, providing one with distance-based decay of interaction strength, plus a scalar scaling factor for the interaction, but of course no term masses. We are working on an experiment design to change that. Therefore, with certain mechanisms in neural networks that could host such quasi-physical fields, a novel approach to the modeling of mind content seems plausible, subject to scrutiny. Work in progress in another direction of the same idea indicates that by using certain algorithms, already emerged vs. still emerging content is clearly distinguishable, in line with Aristotle’s Metaphysics. The implications are that a model completed by “term mass” or “term charge” would enable the computation of the specific work equivalent of sentences or documents, and that via replacing semantics by other modalities, vector fields of more general symbolic content could exist as well. Also, the perceived hypersurface generated by the dynamics of language use may be a step toward more advanced models, for example addressing the Hamiltonian of expanding semantic systems, or the relationship between reaction paths in quantum chemistry vs. sentence construction by gradient descent.

  • 16.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Wittek, Peter
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Conceptual machinery of the mythopoetic mind: Attis, a case study. 2015. In: Proceedings of QI-15, 9th International Quantum Interaction Symposium, 2015. Conference paper (Refereed)
    Abstract [en]

    In search for the right interpretation regarding a body of related content, we screened a small corpus of myths about Attis, a minor deity from the Hellenistic period in Asia Minor to identify the noncommutativity of key concepts used in storytelling. Looking at the protagonist's typical features, our experiment showed incompatibility with regard to his gender and downfall. A crosscheck for entanglement found no violation of a Bell inequality, its best approximation being on the border of the local polytope.

  • 17.
    Malec, S.
    et al.
    University of Texas (Austin).
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Widdows, D.
    Microsoft Bing.
    Cohen, T.
    University of Texas (Austin).
    Landing Propp in Interaction Space: First Steps Toward Scalable Open Domain Narrative Analysis With Predication-based Semantic Indexing. 2015. Conference paper (Refereed)
    Abstract [en]

    In this paper, we explore the possibility of applying high-dimensional vector representations of concept-relation-concept triplets, which have been successfully applied to model a small set of relationship types in the biomedical domain, to the task of modeling folk tales. In doing so, our ultimate aim is to develop representations of narratives through which their underlying structure can be compared. The current paper describes our progress toward this aim, with emphasis on addressing the technical challenges involved in moving from the relatively constrained set of relations that have been extracted from biomedical text to the much larger set of unnormalized relations that have been extracted from the open domain. A toy example using graded vectors demonstrates that our approach will be feasible once more material is added to the test collection.

  • 18.
    Wittek, Peter
    et al.
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Kontopoulos, E.
    CERTH.
    Moysiadis, T.
    CERTH.
    Kompatsiaris, I.
    CERTH.
    Monitoring Term Drift Based on Semantic Consistency in an Evolving Vector Field. 2015. In: Proceedings of IJCNN-15, 2015. Conference paper (Refereed)
    Abstract [en]

    Based on the Aristotelian concept of potentiality vs. actuality allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Falling back on the distributional hypothesis to statistically model word meaning, we used evolving fields as a metaphor to express time-dependent changes in a vector space model by a combination of random indexing and evolving self-organizing maps (ESOM). To monitor semantic drifts within the observation period, an experiment was carried out on the term space of a collection of 12.8 million Amazon book reviews. For evaluation, the semantic consistency of ESOM term clusters was compared with their respective neighbourhoods in WordNet, and contrasted with distances among term vectors by random indexing. We found that at 0.05 level of significance, the terms in the clusters showed a high level of semantic consistency. Tracking the drift of distributional patterns in the term space across time periods, we found that consistency decreased, but not at a statistically significant level. Our method is highly scalable, with interpretations in philosophy.

  • 19.
    Kontopoulos, E.
    et al.
    CERTH.
    Corubolo, F.
    University of Liverpool.
    Eggers, A.
    University of Göttingen.
    Ludwig, J.
    University of Göttingen.
    Wieder, P.
    GWDG - Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen.
    Hedges, M.
    King's College London.
    Waddington, S.
    King's College London.
    Chanod, J-P.
    Xerox European Research Centre.
    Vion-Dury, J-Y.
    Xerox European Research Centre.
    Hasan, A.
    University of Liverpool.
    Watry, P.
    University of Liverpool.
    Darányi, Sándor
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT.
    Pinchuk, R.
    SpaceApps.
    Laurenson, P.
    Tate.
    Mueller, C.
    B.USOC.
    Spyroglou, O.
    Dotsoft.
    Kompatsiaris, I.
    CERTH.
    PERICLES EU Integrated Project: Research Strategy and First Results. 2015. In: Proceedings of EU Project Networking Session, 2015. Conference paper (Other academic)
  • 20.
    Pocklington, Michael
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Eggers, Anna-Grit
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Corubolo, Fabio
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Hedges, Mark
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Ludwig, Jens
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    A Biological Perspective on Digital Ecosystems and Digital Preservation. 2014. In: Proceedings of the 11th International Conference on Digital Preservation, 6-10 October, 2014, Melbourne, Australia / [ed] Serena Coates, Ross King, Steve Knight, Christopher Lee, Peter McKinney, Erin O'Meara, David Pearson, 2014, pp. 363-365. Conference paper (Other academic)
    Abstract [en]

    Successful preservation of Digital Objects (DOs) ultimately demands a solid theoretical framework. Such a framework with a high degree of generality emerges by treating DOs as containers of functional genetic information, exactly as in the genomes of organisms. We observe that functionality links survival in organisms and utility in DOs. In both cases, functional information is identifiable in principle by the consequence of its ablation. In molecular biology, genetic ablations (mutations) and environmental ablations (experimental manipulations) are used to construct interaction maps fully representing organismic activity. The equivalent of such interaction maps are dependency networks for the use of DOs within their Digital Environment (DE). In the poster we will present early work on the application of the theoretical background. It includes first results from a case-study examining a software-based art preservation scenario (SBA) developed as part of the PERICLES FP7 project [1].

  • 21.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Liu, Ying-Hsang
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    A Vector Field Approach to Lexical Semantics. 2014. Conference paper (Refereed)
    Abstract [en]

    We report work in progress on measuring "forces" underlying semantic drift by comparing it with plate tectonics in geology. Based on a brief survey of energy as a key concept in machine learning, and the Aristotelian concept of potentiality vs. actuality allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Pending evidence to the contrary, it was assumed that a classical field in physics is appropriate to model word semantics. The approach used the distributional hypothesis to statistically model word meaning. We do not address the modelling of sentence meaning here. The computability of a vector field for the indexing vocabulary of the Reuters-21578 test collection by an emergent self-organizing map suggests that energy minima as learnables in machine learning presuppose concepts as energy minima in cognition. Our finding needs to be confirmed by a systematic evaluation.

  • 22.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Accelerating Text Mining Workloads in a MapReduce-based Distributed GPU Environment. 2013. In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 73, no. 2, pp. 198-206. Article in journal (Refereed)
    Abstract [en]

    Scientific computations have been using GPU-enabled computers successfully, often relying on distributed nodes to overcome the limitations of device memory. Only a handful of text mining applications benefit from such infrastructure. Since the initial steps of text mining are typically data intensive, and the ease of deployment of algorithms is an important factor in developing advanced applications, we introduce a flexible, distributed, MapReduce-based text mining workflow that performs I/O-bound operations on CPUs with industry-standard tools and then runs compute-bound operations on GPUs which are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes with two NVidia Tesla M2050s attached to each, and we achieve considerable speedups for random projection and self-organizing maps.

  • 23.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Koopman, Bevan
    Zuccon, Guido
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Combining Word Semantics within Complex Hilbert Space for Information Retrieval. 2013. Conference paper (Refereed)
    Abstract [en]

    Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics often overlooked the role of complex numbers. Specifically, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models of IR, one has to lift the constraint of real-valued representations of the information space, and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR, which explicitly uses complex-valued Hilbert spaces, so that terms, documents and queries are represented as complex-valued vectors. The proposal consists of integrating distributional semantics evidence within the real component of a term vector, whereas ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested in the TREC Medical Record task of retrieving cohorts for clinical studies.
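The real/imaginary split described in this abstract can be sketched as follows. Only the split itself comes from the abstract. The helper names, the toy feature values and the scoring function (normalised magnitude of the Hermitian inner product) are assumptions, since the abstract does not say how vectors are compared.

```python
import numpy as np

def complex_term_vector(distributional, ontological):
    """Real part: distributional evidence; imaginary part: ontological evidence."""
    real = np.asarray(distributional, dtype=float)
    imag = np.asarray(ontological, dtype=float)
    return real + 1j * imag

def hermitian_similarity(u, v):
    """Assumed scoring: |<u, v>| / (||u|| ||v||), with <u, v> the Hermitian inner product."""
    inner = np.vdot(u, v)  # np.vdot conjugates its first argument
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return 0.0 if norm == 0 else float(abs(inner) / norm)

# Toy feature values (hypothetical, for illustration only)
term_a = complex_term_vector([1.0, 0.0], [0.0, 1.0])
term_b = complex_term_vector([1.0, 0.0], [0.0, 1.0])
term_c = complex_term_vector([0.0, 1.0], [1.0, 0.0])

print(hermitian_similarity(term_a, term_b))
print(hermitian_similarity(term_a, term_c))
```

Identical vectors score 1, while `term_a` and `term_c` are orthogonal under the Hermitian inner product because their real and imaginary contributions cancel.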

  • 24.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Demonstrating Conceptual Dynamics in an Evolving Text Collection. 2013. In: Journal of the Association for Information Science and Technology, ISSN 2330-1635, E-ISSN 2330-1643, Vol. 64, no. 12, pp. 2564-2572. Article in journal (Refereed)
    Abstract [en]

    Based on real-world user demands, we demonstrate how animated visualisation of evolving text corpora displays the underlying dynamics of semantic content. To interpret the results, one needs a dynamic theory of word meaning. We suggest that conceptual dynamics as the interaction between kinds of intellectual, emotional etc. content, and language, is key for such a theory. We demonstrate our methodology by two-way seriation, which is a popular technique to analyse groups of similar instances and their features, as well as the connections between the groups themselves. The two-way seriated data may be visualised as a two-dimensional heat map or as a three-dimensional landscape where colour codes or height correspond to the values in the matrix. In this paper we focus on two-way seriation of sparse data in the Reuters-21578 test collection. To achieve a meaningful visualisation thereof we introduce a compactly supported convolution kernel similar to filter kernels used in image reconstruction and geostatistics. This filter populates the high-dimensional sparse space with values that interpolate nearby elements, and provides insight into the clustering structure. We also extend two-way seriation to deal with online updates of both the row and column spaces, and, combined with the convolution kernel, demonstrate a three-dimensional visualisation of dynamics.

  • 25. Ofek, Nir
    et al.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Rokach, Lior
    Linking Motif Sequences to Tale Types by Machine Learning. 2013. Conference paper (Refereed)
    Abstract [en]

    Abstract units of narrative content called motifs constitute sequences, also known as tale types. However, whereas the dependency of tale types on the constituent motifs is clear, the strength of their bond has not been measured thus far. Based on the observation that differences between such motif sequences are reminiscent of nucleotide and chromosome mutations in genetics, i.e., constitute “narrative DNA”, we used sequence mining methods from bioinformatics to learn more about the nature of tale types as a corpus. 94% of the Aarne-Thompson-Uther catalogue (2249 tale types in 7050 variants) was listed as individual motif strings based on the Thompson Motif Index, and scanned for similar subsequences. Next, using machine learning algorithms, we built and evaluated a classifier which predicts the tale type of a new motif sequence. Our findings indicate that, due to the size of the available samples, the classification model was best able to predict magic tales, novelles and jokes.

  • 26. Hedges, Mark
    et al.
    Waddington, Simon
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Maceviciute, Elena
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wilson, Tom
    Kompatsiaris, Yiannis
    Dasiopoulu, Stamatia
    Spyroglou, Odysseas
    Ludwig, Jens
    Wieder, Philipp
    Watry, Paul
    Hasan, Adil
    Corubolo, Fabio
    Pinchuk, Rani
    Chanod, Jean-Pierre
    Vion-Dury, Jean-Yves
    Baxter, Rob
    Laurenson, Pip
    Muller, Christian
    PERICLES: Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics. 2013. Conference paper (Other academic)
    Abstract [en]

    This poster paper describes the objectives, approach and use cases of the EC FP7 Integrated Project PERICLES. The project began on 1st February 2013 and runs for four years. The aim is to research and prototype solutions for digital preservation in continually evolving environments including changes in context, semantics and practices. The project addresses use cases focusing on digital art, media and science.

  • 27.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Kitto, Kirsty
    The Sphynx's new riddle: How to relate the canonical formula of myth to quantum interaction. 2013. Conference paper (Refereed)
    Abstract [en]

    We introduce Claude Lévi-Strauss' canonical formula (CF), an attempt to rigorously formalise the general narrative structure of myth. This formula utilises the Klein group as its basis, but a recent work draws attention to its natural quaternion form, which opens up the possibility that it may require a quantum-inspired interpretation. We present the CF in a form that can be understood by a non-anthropological audience, using the formalisation of a key myth (that of Adonis) to draw attention to its mathematical structure. The future potential formalisation of mythological structure within a quantum-inspired framework is proposed and discussed, with a probabilistic interpretation further generalising the formula.

  • 28.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    A GPU-Accelerated Algorithm for Self-Organizing Maps in a Distributed Environment. 2012. Conference paper (Refereed)
    Abstract [en]

    In this paper we introduce a MapReduce-based implementation of self-organizing maps that performs compute-bound operations on distributed GPUs. The kernels are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes with two NVidia Tesla M2050s attached to each, and we achieve a 10x speedup for self-organizing maps over a distributed CPU algorithm.

  • 29.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Connecting the Dots: Mass, Energy, Word Meaning, and Particle-Wave Duality. 2012. Conference paper (Refereed)
    Abstract [en]

    With insight from linguistics that degrees of text cohesion are similar to forces in physics, and the frequent use of the energy concept in text categorization by machine learning, we consider the applicability of particle-wave duality to semantic content inherent in index terms. Wave-like interpretations go back to the regional nature of such content, utilizing functions for its representation, whereas content as a particle can be conveniently modelled by position vectors. Interestingly, wave packets behave like particles, lending credibility to the duality hypothesis. We show in a classical mechanics framework how metaphorical term mass can be computed.

  • 30.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Forró, László
    Detecting Multiple Motif Co-occurrences in the Aarne-Thompson-Uther Tale Type Catalog: A Preliminary Survey. 2012. In: Anales de Documentación, ISSN 1575-2437, E-ISSN 1697-7904, Vol. 15, no. 1. Article in journal (Refereed)
    Abstract [en]

    Catalogs project subject field experience onto a multidimensional map which is then converted to a hierarchical list. In the case of the Aarne-Thompson-Uther Tale Type Catalog (ATU), this subject field is the global pattern of tale content defining tale types as canonical motif sequences. To extract and visualize such a map, we considered ATU as a corpus and analysed two segments of it, “Supernatural adversaries” (types 300-399) in particular and “Tales of magic” (types 300-749) in general. The two corpora were scrutinized for multiple motif co-occurrences and visualized by two-mode clustering of a bag-of-motif co-occurrences matrix. Findings indicate the presence of canonical content units above motif level as well. The organization scheme of folk narratives utilizing motif sequences is reminiscent of nucleotide sequences in the genetic code.

  • 31.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Digital Preservation in Grids and Clouds: A Middleware Approach. 2012. In: Journal of Grid Computing, ISSN 1570-7873, E-ISSN 1572-9184, Vol. 10, no. 1, pp. 133-149. Article in journal (Refereed)
    Abstract [en]

    Digital preservation is the persistent archiving of digital assets for future access and reuse, irrespective of the underlying platform and software solutions. Existing preservation systems have a strong focus on grids, but the advent of cloud technologies offers an attractive option. We describe a middleware system that enables a flexible choice between a grid and a cloud for ad-hoc computations that arise during the execution of a preservation workflow and also for archiving digital objects. The choice between different infrastructures remains open during the lifecycle of the archive, ensuring a smooth switch between different solutions to accommodate the changing requirements of the organization that needs its digital assets preserved. We also offer insights on the costs, running times, and organizational issues of cloud computing, proving that the cloud alternative is particularly attractive for smaller organizations without access to a grid or with limited IT infrastructure.

  • 32. Declerck, Thierry
    et al.
    Lendvai, Piroska
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Multilingual and Semantic Extension of Folk Tale Catalogues. 2012. Conference paper (Refereed)
    Abstract [en]

    We address the multilingual and semantic upgrades of two digital catalogues of motifs and types in folk literature: Thompson’s Motif-Index of Folk-Literature (TMI) and the Aarne-Thompson-Uther classification system (ATU). The methods convert, translate, and represent their digitized content in terms of various (so far often implicit) structural and linguistic components. The results will enable (i) utilizing these resources for semi-automatic analysis and indexing of texts of relevant genres in a multilingual setting, and (ii) pre-processing the data for analysing motif sequences in folktale plots. We plan to publish the resulting data, which can be made available in the Linked Open Data (LOD) framework.

  • 33.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    The gravity of meaning: Physics as a metaphor to model semantic changes. 2012. Conference paper (Refereed)
    Abstract [en]

    Based on a computed toy example, we offer evidence that by plugging in similarity of word meaning as a force plus a small modification of Newton’s 2nd law, one can acquire specific “mass” values for index terms in a Saltonesque dynamic library environment. The model can describe two types of change which affect the semantic composition of document collections: the expansion of a corpus due to its update, and fluctuations of the gravitational potential energy field generated by normative language use as an attractor juxtaposed with actual language use yielding time-dependent term frequencies. By the evolving semantic potential of a vocabulary and concatenating the respective term “mass” values, one can model sentences or longer strings of symbols as vector-valued functions. Since the line integral of such functions is used to express the work of a particle in a gravitational field, the work equivalent of strings can be calculated.

  • 34.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    The importance of context for Digital Libraries. 2012. In: Cuadernos de Gestión de Información, ISSN 2253-8429, Vol. 2, no. 1, pp. 1-3. Article in journal (Other academic)
    Abstract [en]

    The concept of "context" has great importance in digital preservation. This paper analyzes the meaning of context from the point of view of access to digital objects, combining aspects of linguistics, terminological disambiguation in information retrieval, and text categorization. In these areas, context is a key element for successful disambiguation and thus for better results. Therefore, the preservation of and subsequent access to digital objects should also consider the preservation of appropriate information about the terminology and social context in which these objects were generated.

  • 35.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Maceviciute, Elena
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wilson, Tom
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    The SHAMAN project on digital preservation. 2012. Conference paper (Other academic)
  • 36.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Forró, László
    Toward Sequencing “Narrative DNA”: Tale Types, Motif Strings and Memetic Pathways. 2012. Conference paper (Other academic)
    Abstract [en]

    The Aarne-Thompson-Uther Tale Type Catalog (ATU) is a bibliographic tool which uses metadata from tale content, called motifs, to define tale types as canonical motif sequences. The motifs themselves are listed in another bibliographic tool, the Aarne-Thompson Motif Index (AaTh). Tale types in ATU are defined in an abstracted fashion and can be processed like a corpus. We analyzed 219 types with 1202 motifs from the “Tales of magic” (types 300-749) segment to exemplify that motif sequences show signs of recombination in the storytelling process. Compared to chromosome mutations in genetics, we offer examples for insertion/deletion, duplication and, possibly, transposition, whereas the sample was not sufficient to find inverted motif strings as well. These initial findings encourage efforts to sequence motif strings like DNA in genetics, attempting to find for instance the longest common motif subsequences in tales. Expressing the network of motif connections by graphs suggests that tale plots as consolidated pathways of content help one memorize culturally engraved messages. We anticipate a connection between such networks and Waddington’s epigenetic landscape.
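The "longest common motif subsequence" mentioned in this abstract can be sketched with the textbook dynamic-programming algorithm. This is not the authors' implementation. The function name is illustrative, and the labels `M1`, `M2`, ... are hypothetical placeholders, not real motif codes from any index.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two motif sequences (standard DP)."""
    n, m = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover the shared motifs in order
    shared = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            shared.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return shared[::-1]

# Hypothetical motif strings for two tale variants (placeholder labels)
tale_a = ["M1", "M2", "M3", "M4", "M5"]
tale_b = ["M1", "M3", "M9", "M5"]

shared = longest_common_subsequence(tale_a, tale_b)
print(shared)  # ['M1', 'M3', 'M5']
```

In this toy pair, `M2` and `M4` appear only in the first variant and `M9` only in the second, which is the kind of insertion/deletion difference the abstract compares to mutations.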

  • 37.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Forró, László
    Detecting Multiple Motif Co-occurrences in the Aarne-Thompson-Uther Tale Type Catalog: A Preliminary Survey. 2011. In: Anales de Documentación, ISSN 1575-2437, E-ISSN 1697-7904. Article in journal (Other academic)
  • 38.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Introducing Scalable Quantum Approaches in Language Representation. 2011. Conference paper (Refereed)
    Abstract [en]

    High-performance computational resources and distributed systems are crucial for the success of real-world language technology applications. The novel paradigm of general-purpose computing on graphics processors (GPGPU) offers a feasible and economical alternative: it has already become a common phenomenon in scientific computation, with many algorithms adapted to the new paradigm. However, applications in language technology do not readily adapt to this approach. Recent advances show the applicability of quantum metaphors in language representation, and many algorithms in quantum mechanics have already been adapted to GPGPU computing. SQUALAR aims to match quantum algorithms with heterogeneous computing to develop new formalisms of information representation for natural language processing in quantum environments.

  • 39.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Leveraging on High-Performance Computing and Cloud Technologies in Digital Libraries: A Case Study. 2011. Conference paper (Refereed)
    Abstract [en]

    With the emergence of high-performance computing instances in the cloud, massive scale computations have become available to technically every organization. Digital libraries typically employ a data-intensive infrastructure, but given the resources, advanced services based on data and text mining could be developed. A fundamental issue is the ease of development and integration of such services. We demonstrate the feasibility by providing a case study on a visual machine learning algorithm with MapReduce running in the cloud in a small cluster.

  • 40.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Spectral Composition of Semantic Spaces. 2011. Conference paper (Refereed)
    Abstract [en]

    Spectral theory in mathematics is key to the success of application domains as diverse as quantum mechanics and latent semantic indexing, both relying on eigenvalue decomposition for the localization of their respective entities in observation space. This points at some implicit “energy” inherent in semantics and in need of quantification. We show how the structure of atomic emission spectra, and meaning in concept space, go back to the same compositional principle, plus propose a tentative solution for the computation of term, document and collection “energy” content.

  • 41.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Forró, László
    Toward Sequencing Multiple Motif Co-Occurrences. 2011. In: Tanulmányok az örökségmenedzsmentröl 2. Kulturális örökségek kezelése [Studies in Heritage Management 2: The Management of Cultural Heritage] / [ed] L. Bassa, Információs Társadalomért Alapítvány, 2011, pp. 247-260. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    Catalogs project subject field experience onto a multidimensional map which is then converted to a hierarchical list. In the case of the Aarne-Thompson-Uther Tale Type Catalog (ATU), this subject field is the global pattern of tale content defining tale types as canonical motif sequences. To extract and visualize such a map, we considered ATU as a corpus and analysed two segments of it, “Supernatural adversaries” (types 300-399) in particular and “Tales of magic” (types 300-749) in general. The two corpora were scrutinized for multiple motif co-occurrences and visualized by two-mode clustering of a bag-of-motif co-occurrences matrix. Findings indicate the presence of canonical content units above motif level as well. The organization scheme of folk narratives utilizing motif sequences is reminiscent of nucleotide sequences in the genetic code.

  • 42.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Dobreva, Milena
    Using wavelet analysis for text categorization in digital libraries: a first experiment with Strathprints. 2011. In: International Journal on Digital Libraries, ISSN 1432-5012, E-ISSN 1432-1300. Article in journal (Refereed)
    Abstract [en]

    Digital libraries increasingly benefit from research on automated text categorization for improved access. Such research is typically carried out by using standard test collections. In this paper we present a pilot experiment of replacing such test collections by a set of 6000 objects from a real-world digital repository, indexed by Library of Congress Subject Headings, and test support vector machines in a supervised learning setting for their ability to reproduce the existing classification. To augment the standard approach, we introduce a combination of two novel elements: using functions for document content representation in Hilbert space, and adding extra semantics from lexical resources to the representation. Results suggest that wavelet-based kernels slightly outperformed traditional kernels on classification reconstruction from abstracts and vice versa from full-text documents, the latter outcome due to word sense ambiguity. The practical implementation of our methodological framework enhances the analysis and representation of specific knowledge relevant to large-scale digital collections, in this case the thematic coverage of the collections. Representation of specific knowledge about digital collections is one of the basic elements of the persistent archives and the less studied one (compared to representations of digital objects and collections). Our research is an initial step in this direction developing further the methodological approach and demonstrating that text categorisation can be applied to analyse the thematic coverage in digital repositories.

  • 43.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Jacquin, Thierry
    Déjean, Hervé
    Chanod, Jean-Pierre
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    XML Processing in the Cloud: Large-Scale Digital Preservation in Small Institutions. 2011. Conference paper (Refereed)
    Abstract [en]

    Digital preservation deals with the problem of retaining the meaning of digital information over time to ensure its accessibility. The process often involves a workflow which transforms the digital objects. The workflow defines document pipelines containing transformations and validation checkpoints, either to facilitate migration for persistent archival or to extract metadata. The transformations, nevertheless, are computationally expensive, and therefore digital preservation can be out of reach for an organization whose core operation is not in data conservation. The operations described in the document workflow, however, do not frequently reoccur. This paper combines an implementation-independent workflow designer with cloud computing to support small institutions in their ad hoc peak computing needs that stem from their efforts in digital preservation.

  • 44.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Examples of Formulaity in Narratives and Scientific Communication. 2010. In: Proceedings of the 1st International AMICUS Workshop, October 21, 2010, Vienna, Austria / [ed] Sándor Darányi, Piroska Lendvai, University of Szeged, Hungary, 2010, pp. 29-35. Conference paper (Refereed)
    Abstract [en]

    The AMICUS project was designed to promote scholarly networking in a topical area, motif recognition in texts, including its automation. Prior to doing so, however, it is necessary to show the theoretical underpinnings of the research idea. My argument is that evidence from different disciplines amounts to fragmented pieces of a bigger picture. By compiling them like pieces of a puzzle, one can see how the concept of formulaity applies to folklore texts and scholarly communication alike. Regardless of the actual name of the concept (e.g. motif, function, canonical form), what matters is that document parts and whole documents can be characterized by standard sequences of content elements, such formulaic expressions enabling higher-level document indexing and classification by machine learning, plus document retrieval. Information filtering plays a key role in the proposed technology.

  • 45. Lendvai, Piroska
    et al.
    Declerck, Thierry
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Gervás, Pablo
    Hervás, Raquel
    Malec, Scott
    Peinado, Federico
    Integration of Linguistic Markup into Semantic Models of Folk Narratives: The Fairy Tale Use Case. In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10). 2010. Conference paper (Refereed)
    Abstract [en]

    Propp’s influential structural analysis of fairy tales created a powerful schema for representing storylines in terms of character functions, which is directly exploitable for computational semantic analysis, and procedural generation of stories of this genre. We tackle two resources that draw on the Proppian model: one formalizes it as a semantic markup scheme and the other as an ontology, both lacking linguistic phenomena explicitly represented in them. The need for integrating linguistic information into structured semantic resources is motivated by the emergence of suitable standards that facilitate this, and the benefits such joint representation would create for transdisciplinary research across Digital Humanities, Computational Linguistics, and Artificial Intelligence.

  • 46.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Dobreva, Milena
    Matching Evolving Hilbert Spaces and Language for Semantic Access to Digital Libraries. 2010. In: The Role of Digital Libraries in a Time of Global Change. Proceedings of the 12th International Conference on Asia-Pacific Digital Libraries, ICADL 2010, Gold Coast, Australia, June 21-25, 2010 / [ed] Gobinda Chowdhury, Chris Koo, Jane Hunter, Springer, 2010, pp. 262-263. Conference paper (Other academic)
    Abstract [en]

    Extended by function (Hilbert) spaces, the 5S model of digital libraries (DL) [1] enables a physical interpretation of vectors and functions to keep track of the evolving semantics and usage context of the digital objects by support vector machines (SVM) for text categorization (TC). For this conceptual transition, three steps are necessary: (1) the application of the formal theory of DL to Lebesgue (function, L2) spaces; (2) considering semantic content as vectors in the physical sense (i.e. position and direction vectors) rather than as in linear algebra, thereby modelling word semantics as an evolving field underlying classifications of digital objects; (3) the replacement of vectors by functions in a new compact support basis function (CSBF) semantic kernel utilizing wavelets for TC by SVMs.

  • 47.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Dobreva, Milena
    Position paper: Adding a 5M layer to the 5S model of digital libraries. 2010. Conference paper (Refereed)
    Abstract [en]

    We expect radical changes in document (first and foremost text) representation for digital libraries (DL) leading to new applications for document processing.

  • 48.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Lendvai, Piroska
    Proceedings of the First AMICUS Workshop, October 21, 2010 Vienna, Austria. 2010. Collection (editor) (Other academic)
    Abstract [en]

    In cultural heritage objects, digitized or not, content indicators occurring on higher than word level are often called motifs or their equivalent. Their recognition for document classification and retrieval is largely unresolved. Work on identifying rhetorical, narrative and persuasive elements in scientific texts has been progressing in several, but largely unconnected, tracks. The AMICUS project (running between 2009 and 2012) set out to test a possible way to resolve these issues, starting with the identification of Proppian functions in folk tale corpora and adapting the solution to the identification of tale motifs or their functional counterparts. AMICUS has devoted its first project year to listing the corpora, tools, methods and contacts available to address these issues. The initiators of the project have identified a common need in the processing of texts from both the cultural heritage (CH) and scientific communication (SC) domains: to perform automated, large-scale higher-order text analytics, i.e., to reach an advanced level of text understanding so that structured knowledge can be extracted from unstructured text. The four research groups propose to tackle an important aspect of this complex issue by investigating how linguistic elements convey motifs in texts from the CH and the SC domains. Our shared working hypothesis is that the identity of higher-order content-bearing elements, i.e., textual units that are typically designated for e.g. document indexing, classification, enrichment, and the like, strongly depends on community perception.

  • 49. Lendvai, Piroska
    et al.
    Declerck, Thierry
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Malec, Scott
    Propp Revisited: Integration of Linguistic Markup into Structured Content Descriptors of Tales. 2010. In: Proceedings of the Conference for Digital Humanities 2010, 2010. Conference paper (Refereed)
    Abstract [en]

    Metadata that serve as semantic markup, such as conceptual categories that describe the macrostructure of a plot in terms of actors and their mutual relationships, actions, and their ingredients annotated in folk narratives, are important additional resources of digital humanities research. Traditionally originating in structural analysis, in fairy tales they are called functions (Propp, 1968), whereas in myths they are called mythemes (Lévi-Strauss, 1955); a related, overarching type of content metadata is a folklore motif (Uther, 2004; Jason, 2000). In his influential study, Propp treated a corpus of tales in Afanas'ev's collection (Afanas'ev, 1945), establishing basic recurrent units of the plot ('functions'), such as Villainy, Liquidation of misfortune, Reward, or Test of Hero, and the combinations and sequences of elements employed to arrange them into moves. His aim was to describe the DNA-like structure of the magic tale sub-genre as a novel way to provide comparisons. As a start along the way to developing a story grammar, the Proppian model is relatively straightforward to formalize for computational semantic annotation, analysis, and generation of fairy tales. Our study describes an effort towards creating a comprehensive XML markup of fairy tales following Propp's functions, by an approach that integrates functional text annotation with grammatical markup in order to be used across text types, genres and languages. The Proppian fairy tale Markup Language (PftML) (Malec, 2001) is an annotation scheme that enables narrative function segmentation, based on hierarchically ordered textual content objects. We propose to extend PftML so that the scheme would additionally rely on linguistic information for the segmentation of texts into Proppian functions.
    Textual variation is an important phenomenon in folklore; it is thus beneficial to explicitly represent linguistic elements in computational resources that draw on this genre. Current international initiatives also actively promote and aim to technically facilitate such integrated and standardized linguistic resources. We describe why and how explicit representation of grammatical phenomena in literary models can provide interdisciplinary benefits for the digital humanities research community. In two related fields of activities, we address the above as part of our ongoing activities in the CLARIN and AMICUS projects. CLARIN aims to contribute to humanities research by creating and recommending effective workflows using natural language processing tools and digital resources in scenarios where text-based research is conducted by humanities or social sciences scholars. AMICUS is interested in motif identification, in order to gain insight into higher-order correlations of functions and other content units in texts from the cultural heritage and scientific discourse domains. We expect significant synergies from their interaction with the PftML prototype.

  • 50. Szöts, Miklós
    et al.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Alexin, Zoltán
    Vincze, Veronika
    Almási, Attila
    Semantic Processing of a Hungarian Ethnographic Corpus. 2010. In: Proceedings of the 1st International AMICUS Workshop, October 21, 2010, Vienna, Austria, pp. 112-115. Article in journal (Refereed)
    Abstract [en]

    In this poster, a Hungarian ethnographic database containing linguistic annotation is presented. The corpus contains texts from three domains, namely, folk beliefs, táltos texts and tales. All the possible morphosyntactic analyses assigned to each word and the appropriate one selected from them (based on contextual information) are also marked. Syntactic (dependency) annotation is added semi-automatically to the corpus texts in a second phase of the processing. With the help of these enriched linguistic attributes, the texts can be semantically analyzed and clustered. The research and development team is working on a semantic search tool enabling users to browse the texts on the basis of their meaning. The proposed technology may result in a new approach to ethnographic research and may open a new type of access to the databases.

  • 51.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Dobreva, Milena
    Toward a 5M Model of Digital Libraries. 2010. Conference paper (Refereed)
    Abstract [en]

    Whereas the DELOS DRM and the 5S model of digital libraries (DL) address the formal side of DL, we argue that a parallel 5M model is emerging as best practice worldwide, integrating multicultural, multilingual, multimodal digital objects with multivariate statistics-based document indexing, categorization and retrieval methods. The fifth M stands for the modeling of the information searching behavior of users, and of collection development. We show how an extension of the 5S model to Hilbert space (a) points toward the integration of several Ms; (b) makes the tracking of evolving semantic content feasible, and (c) leads to a field interpretation of word and sentence semantics underlying language change. First experimental results from the Strathprints e-repository verify the mathematical foundations of the 5M model.

  • 52.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Tan, Chew Lim
    An Ordering of Terms Based on Semantic Relatedness. 2009. In: Proceedings of IWCS-8, January 7-9, 2009, Tilburg, The Netherlands / [ed] H Bunt, V Petukhova, S Wubben, 2009, pp. 235-247. Conference paper (Refereed)
    Abstract [en]

    Term selection methods typically employ a statistical measure to filter or weight terms. Term expansion for IR may also depend on statistics, or use some other, non-metric method based on a lexical resource. At the same time, a wide range of semantic similarity measures have been developed to support natural language processing tasks such as word sense disambiguation. This paper combines the two approaches and proposes an algorithm that provides a semantic order of terms based on a semantic relatedness measure. This semantic order can be exploited by term weighting and term expansion methods.

  • 53.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Creating a dynamic library in the LIVA project: challenges and solutions. 2009. Conference paper (Other academic)
  • 54.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Tan, Chew Lim
    Improving text classification by a sense spectrum approach to term expansion. 2009. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning, 2009, pp. 183-191. Conference paper (Refereed)
  • 55.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Wittek, Peter
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    On Information, Meaning, Space and Geometry. 2009. In: Exploration of Space, Technology and Spatiality: Interdisciplinary Perspectives / [ed] Susan Turner, E. D. P. Turner, Hershey: Idea Group, 2009. Chapter in book, part of anthology (Other academic)
    Abstract [en]

    We offer a few general considerations, with theoretical overtones, working toward the definition and generation of a geometric language for practical purposes, prominently for information retrieval. This chapter is a non-mathematical introduction to the mathematical modelling of meaning of both words and sentences, outlining already existing components of such an endeavour, and hinting at directions of synthesis.

  • 56. Berger, Gertrud
    et al.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Eklund, Johan
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Hallén, Maivor
    Höglund, Lars
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Information visualization for product development in the LIVA project. 2008. In: InfoTrend, ISSN 1653-0225, Vol. 63, no. 1, pp. 3-13. Article in journal (Other (popular science, discussion, etc.))
    Abstract [en]

    The LIVA research and development project (2005-2007) was conceived to integrate automatic indexing, automatic categorization, information visualization and information retrieval in library systems managing textual document collections. After a brief overview of some major information visualization methods, the user interface prototype is introduced.

  • 57.
    Darányi, Sándor
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Eklund, Johan
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Automated text categorization of bibliographic records. 2007. In: Svensk biblioteksforskning, ISSN 0284-4354, E-ISSN 1653-5235, Vol. 16, no. 2, pp. 1-14. Article in journal (Refereed)
  • 58.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    First- and second-order change as symmetry and symmetry breaking in folklore text content evolution: From Heraclitus to Lévi-Strauss. 2007. Conference paper (Refereed)
    Abstract [en]

    We distinguish between first- and second-order change and identify the former with perpetual alternation on an existential plane, the second with moving out into existential space. The first type can be demonstrated by two antagonistic processes inherent in a Markov chain of two pairs of complementary values: the chain gradually alternates between the opposite terminal states and the pattern is symmetrical. Such an existential plane catches an essential feature of Heraclitus’ philosophy, and can be illustrated by examples from classical Greek mythology. The same material also exemplifies Lévi-Strauss’ formula of myth, symmetrical in its weak and asymmetrical in its canonical form. Since the weak form equals the orbit of a Klein group, we hypothesize that the canonical form, and thereby symmetry breaking, can be generated by element exchange between two respective Klein groups. The framework for such processes is text variation in folklore, described by ethnosemiotics.

  • 59.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Látvány és jelentés: Budapesti épuletszobrok elemzése és fejlödéstörténeti modellezése. 2007. Conference paper (Other academic)
  • 60.
    Wittek, Peter
    et al.
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Representing word semantics for IR by continuous functions. 2007. In: Studies in Theory of Information Retrieval. Proceedings of the ICTIR07 Conference, Budapest, 18-20 October 2007 / [ed] Sándor Dominich, Ferenc Kiss, Foundation for Information Society, Budapest, 2007, pp. 149-155. Conference paper (Refereed)
    Abstract [en]

    Information representation is an important but neglected aspect of building text information retrieval models. In order to be efficient, the mathematical objects of a formal model, like vectors, have to reasonably reproduce language-related phenomena such as word meaning inherent in index terms. On the other hand, the classical vector space model, when it comes to the representation of word meaning, is approximative only, whereas it exactly localizes term, query and document content. It can be shown that by replacing vectors by continuous functions, information retrieval in Hilbert space yields comparable or better results. This is because according to the non-classical or continuous vector space model, content cannot be exactly localized. At the same time, the model relies on a richer representation of word meaning than the VSM can offer.

  • 61. Dominich, Sándor
    et al.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Szlávik, Z.
    Magyar hiedelmek hierarchikus struktúrája [The hierarchical structure of Hungarian folk beliefs]. 2006. In: Alkalmazott Nyelvtudomány, ISSN 1587-1061, Vol. 6, no. 1-2, pp. 137-160. Article in journal (Refereed)
  • 62.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    A computationally and neurologically feasible model of semiosis. 2004. In: From Nature to Psyche. Proceedings from the Imatra International Congresses on Semiotics in 2001 and 2002 / [ed] Eero Tarasti, Helsinki: Acta Semiotica Fennica, 2004, pp. 256-264. Conference paper (Other academic)
  • 63.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Language as space. 2004. In: More Space. Proceedings of the "Space, Spatiality and Technology Workshop 2004". Edinburgh: School of Computing, Napier University / [ed] P. Turner, E. Davenport, S. Turner, 2004, pp. 60-64. Conference paper (Other academic)
  • 64.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    Factor analysis and the canonical formula: Where do we go from here? 2003. Conference paper (Refereed)
  • 65.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    HOMO 2003: Information society, cultural heritage and folklore text analysis. 2003. Collection (editor) (Other academic)
  • 66.
    Darányi, Sándor
    Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan.
    The necessity of a European virtual laboratory for the processing of digitized cultural heritage: The HOMO concept. 2003. Conference paper (Refereed)