  • 1. Berger, Gertrud
    et al.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Eklund, Johan
    University of Borås, Swedish School of Library and Information Science.
    Hallén, Maivor
    Höglund, Lars
    University of Borås, Swedish School of Library and Information Science.
    Information visualization for product development in the LIVA project. 2008. In: InfoTrend, ISSN 1653-0225, Vol. 63, no 1, p. 3-13. Article in journal (Other (popular science, discussion, etc.))
    Abstract [en]

    The LIVA research and development project (2005-2007) was conceived to integrate automatic indexing, automatic categorization, information visualization and information retrieval in library systems managing textual document collections. After a brief overview of some major information visualization methods, the user interface prototype is introduced.

  • 2.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Dobreva, Milena
    Toward a 5M Model of Digital Libraries. 2010. Conference paper (Refereed)
    Abstract [en]

    Whereas the DELOS DRM and the 5S model of digital libraries (DL) address the formal side of DL, we argue that a parallel 5M model is emerging as best practice worldwide, integrating multicultural, multilingual, multimodal digital objects with multivariate statistics-based document indexing, categorization and retrieval methods. The fifth M stands for the modeling of the information searching behavior of users, and of collection development. We show how an extension of the 5S model to Hilbert space (a) points toward the integration of several Ms; (b) makes the tracking of evolving semantic content feasible, and (c) leads to a field interpretation of word and sentence semantics underlying language change. First experimental results from the Strathprints e-repository verify the mathematical foundations of the 5M model.

  • 3.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    A computationally and neurologically feasible model of semiosis. 2004. In: From Nature to Psyche. Proceedings from the Imatra International Congresses on Semiotics in 2001 and 2002 / [ed] Eero Tarasti, Helsinki: Acta Semiotica Fennica, 2004, p. 256-264. Conference paper (Other academic)
  • 4.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Creating a dynamic library in the LIVA project: challenges and solutions. 2009. Conference paper (Other academic)
  • 5.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Examples of Formulaity in Narratives and Scientific Communication. 2010. In: Proceedings of the 1st International AMICUS Workshop, October 21, 2010, Vienna, Austria / [ed] Sándor Darányi, Piroska Lendvai, University of Szeged, Hungary, 2010, p. 29-35. Conference paper (Refereed)
    Abstract [en]

    The AMICUS project was designed to promote scholarly networking in a topical area, motif recognition in texts, including its automation. Prior to doing so, however, it is necessary to show the theoretical underpinnings of the research idea. My argument is that evidence from different disciplines amounts to fragmented pieces of a bigger picture. By compiling them like pieces of a puzzle, one can see how the concept of formulaity applies to folklore texts and scholarly communication alike. Regardless of the actual name of the concept (e.g. motif, function, canonical form), what matters is that document parts and whole documents can be characterized by standard sequences of content elements, such formulaic expressions enabling higher-level document indexing and classification by machine learning, plus document retrieval. Information filtering plays a key role in the proposed technology.

  • 6.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Factor analysis and the canonical formula: Where do we go from here? 2003. Conference paper (Refereed)
  • 7.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    First- and second-order change as symmetry and symmetry breaking in folklore text content evolution: From Heraclitus to Lévi-Strauss. 2007. Conference paper (Refereed)
    Abstract [en]

    We distinguish between first- and second-order change and identify the former with perpetual alternation on an existential plane, the latter with moving out into existential space. The first type can be demonstrated by two antagonistic processes inherent in a Markov chain of two pairs of complementary values: the chain gradually alternates between the opposite terminal states and the pattern is symmetrical. Such an existential plane catches an essential feature of Heraclitus’ philosophy, and can be illustrated by examples from classical Greek mythology. The same material also exemplifies Lévi-Strauss’ formula of myth, symmetrical in its weak and asymmetrical in its canonical form. Since the weak form equals the orbit of a Klein group, we hypothesize that the canonical form, and thereby symmetry breaking, can be generated by element exchange between two respective Klein groups. The framework for such processes is text variation in folklore, described by ethnosemiotics.

  • 8.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    HOMO 2003: Information society, cultural heritage and folklore text analysis. 2003. Collection (editor) (Other academic)
  • 9.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Language as space. 2004. In: More Space. Proceedings of the "Space, Spatiality and Technology Workshop 2004". Edinburgh: School of Computing, Napier University / [ed] P. Turner, E. Davenport, S. Turner, 2004, p. 60-64. Conference paper (Other academic)
  • 10.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Látvány és jelentés: Budapesti épuletszobrok elemzése és fejlödéstörténeti modellezése [Sight and meaning: Analysis and evolutionary modelling of building sculptures in Budapest]. 2007. Conference paper (Other academic)
  • 11.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    The importance of context for Digital Libraries. 2012. In: Cuadernos de Gestión de Información, ISSN 2253-8429, Vol. 2, no 1, p. 1-3. Article in journal (Other academic)
    Abstract [en]

    The concept of "context" has great importance in digital preservation. This paper analyzes the meaning of context from the point of view of access to digital objects, combining linguistics, terminological disambiguation in information retrieval and text categorization aspects. In these areas, context is a key element for successful disambiguation and thus for better results. Therefore, the preservation and subsequent access of digital objects should also consider the preservation of appropriate information about the terminology and social context in which these objects were generated.

  • 12.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    The necessity of a European virtual laboratory for the processing of digitized cultural heritage: The HOMO concept. 2003. Conference paper (Refereed)
  • 13.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Eklund, Johan
    University of Borås, Swedish School of Library and Information Science.
    Automated text categorization of bibliographic records. 2007. In: Svensk biblioteksforskning, ISSN 0284-4354, E-ISSN 1653-5235, Vol. 16, no 2, p. 1-14. Article in journal (Refereed)
  • 14.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Forró, László
    Detecting Multiple Motif Co-occurrences in the Aarne-Thompson-Uther Tale Type Catalog: A Preliminary Survey. 2012. In: Anales de Documentación, ISSN 1575-2437, E-ISSN 1697-7904, Vol. 15, no 1. Article in journal (Refereed)
    Abstract [en]

    Catalogs project subject field experience onto a multidimensional map which is then converted to a hierarchical list. In the case of the Aarne-Thompson-Uther Tale Type Catalog (ATU), this subject field is the global pattern of tale content defining tale types as canonical motif sequences. To extract and visualize such a map, we considered ATU as a corpus and analysed two segments of it, “Supernatural adversaries” (types 300-399) in particular and “Tales of magic” (types 300-749) in general. The two corpora were scrutinized for multiple motif co-occurrences and visualized by two-mode clustering of a bag-of-motif co-occurrences matrix. Findings indicate the presence of canonical content units above motif level as well. The organization scheme of folk narratives utilizing motif sequences is reminiscent of nucleotide sequences in the genetic code.

  • 15.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Forró, László
    Detecting Multiple Motif Co-occurrences in the Aarne-Thompson-Uther Tale Type Catalog: A Preliminary Survey. 2011. In: Anales de Documentación, ISSN 1575-2437, E-ISSN 1697-7904. Article in journal (Other academic)
  • 16.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Forró, László
    Toward Sequencing Multiple Motif Co-Occurrences. 2011. In: Tanulmányok az örökségmenedzsmentröl 2. Kulturális örökségek kezelése [Studies in Heritage Management 2: The Management of Cultural Heritage] / [ed] L. Bassa, Információs Társadalomért Alapítvány, 2011, p. 247-260. Chapter in book (Refereed)
    Abstract [en]

    Catalogs project subject field experience onto a multidimensional map which is then converted to a hierarchical list. In the case of the Aarne-Thompson-Uther Tale Type Catalog (ATU), this subject field is the global pattern of tale content defining tale types as canonical motif sequences. To extract and visualize such a map, we considered ATU as a corpus and analysed two segments of it, “Supernatural adversaries” (types 300-399) in particular and “Tales of magic” (types 300-749) in general. The two corpora were scrutinized for multiple motif co-occurrences and visualized by two-mode clustering of a bag-of-motif co-occurrences matrix. Findings indicate the presence of canonical content units above motif level as well. The organization scheme of folk narratives utilizing motif sequences is reminiscent of nucleotide sequences in the genetic code.

  • 17.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Lendvai, Piroska
    Proceedings of the First AMICUS Workshop, October 21, 2010 Vienna, Austria. 2010. Collection (editor) (Other academic)
    Abstract [en]

    In cultural heritage objects, digitized or not, content indicators occurring on higher than word level are often called motifs or their equivalent. Their recognition for document classification and retrieval is largely unresolved. Work on identifying rhetorical, narrative and persuasive elements in scientific texts has been progressing in several but largely unconnected tracks. The AMICUS project (running between 2009 and 2012) set out to test a possible way to resolve these issues, starting with the identification of Proppian functions in folk tale corpora and adapting the solution to the identification of tale motifs or their functional counterparts. AMICUS has devoted its first project year to listing the corpora, tools, methods and contacts available to address these issues. The initiators of the project have identified a common need in the processing of texts from both the cultural heritage (CH) and scientific communication (SC) domains: to perform automated, large-scale higher-order text analytics, i.e., to reach an advanced level of text understanding so that structured knowledge can be extracted from unstructured text. The four research groups propose to tackle an important aspect of this complex issue by investigating how linguistic elements convey motifs in texts from the CH and the SC domains. Our shared working hypothesis is that the identity of higher-order content-bearing elements, i.e., textual units that are typically designated for e.g. document indexing, classification, enrichment, and the like, strongly depends on community perception.

  • 18.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Maceviciute, Elena
    University of Borås, Swedish School of Library and Information Science.
    Wilson, Tom
    University of Borås, Swedish School of Library and Information Science.
    The SHAMAN project on digital preservation. 2012. Conference paper (Other academic)
  • 19.
    Darányi, Sándor
    et al.
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Conceptual machinery of the mythopoetic mind: Attis, a case study. 2015. In: Proceedings of QI-15, 9th International Quantum Interaction Symposium, 2015. Conference paper (Refereed)
    Abstract [en]

    In search of the right interpretation regarding a body of related content, we screened a small corpus of myths about Attis, a minor deity from the Hellenistic period in Asia Minor, to identify the noncommutativity of key concepts used in storytelling. Looking at the protagonist's typical features, our experiment showed incompatibility with regard to his gender and downfall. A crosscheck for entanglement found no violation of a Bell inequality, its best approximation being on the border of the local polytope.

  • 20.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Connecting the Dots: Mass, Energy, Word Meaning, and Particle-Wave Duality. 2012. Conference paper (Refereed)
    Abstract [en]

    With insight from linguistics that degrees of text cohesion are similar to forces in physics, and the frequent use of the energy concept in text categorization by machine learning, we consider the applicability of particle-wave duality to semantic content inherent in index terms. Wave-like interpretations go back to the regional nature of such content, utilizing functions for its representation, whereas content as a particle can be conveniently modelled by position vectors. Interestingly, wave packets behave like particles, lending credibility to the duality hypothesis. We show in a classical mechanics framework how metaphorical term mass can be computed.

  • 21.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Demonstrating Conceptual Dynamics in an Evolving Text Collection. 2013. In: Journal of the Association for Information Science and Technology, ISSN 2330-1635, E-ISSN 2330-1643, Vol. 64, no 12, p. 2564-2572. Article in journal (Refereed)
    Abstract [en]

    Based on real-world user demands, we demonstrate how animated visualisation of evolving text corpora displays the underlying dynamics of semantic content. To interpret the results, one needs a dynamic theory of word meaning. We suggest that conceptual dynamics as the interaction between kinds of intellectual, emotional etc. content, and language, is key for such a theory. We demonstrate our methodology by two-way seriation which is a popular technique to analyse groups of similar instances and their features, as well as the connections between the groups themselves. The two-way seriated data may be visualised as a two-dimensional heat map or as a three-dimensional landscape where colour codes or height correspond to the values in the matrix. In this paper we focus on two-way seriation of sparse data in the Reuters-21578 test collection. To achieve a meaningful visualisation thereof we introduce a compactly supported convolution kernel similar to filter kernels used in image reconstruction and geostatistics. This filter populates the high-dimensional sparse space with values that interpolate nearby elements, and provides insight into the clustering structure. We also extend two-way seriation to deal with online updates of both the row and column spaces, and, combined with the convolution kernel, demonstrate a three-dimensional visualisation of dynamics.

  • 22.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    On Information, Meaning, Space and Geometry. 2009. In: Exploration of Space, Technology and Spatiality: Interdisciplinary Perspectives / [ed] Susan Turner, E. D. P. Turner, Hershey: Idea Group, 2009. Chapter in book (Other academic)
    Abstract [en]

    We offer a few general considerations, with theoretical overtones, working toward the definition and generation of a geometric language for practical purposes, prominently for information retrieval. This chapter is a non-mathematical introduction to the mathematical modelling of meaning of both words and sentences, outlining already existing components of such an endeavour, and hinting at directions of synthesis.

  • 23.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    The gravity of meaning: Physics as a metaphor to model semantic changes. 2012. Conference paper (Refereed)
    Abstract [en]

    Based on a computed toy example, we offer evidence that by plugging in similarity of word meaning as a force plus a small modification of Newton’s 2nd law, one can acquire specific “mass” values for index terms in a Saltonesque dynamic library environment. The model can describe two types of change which affect the semantic composition of document collections: the expansion of a corpus due to its update, and fluctuations of the gravitational potential energy field generated by normative language use as an attractor juxtaposed with actual language use yielding time-dependent term frequencies. By the evolving semantic potential of a vocabulary and concatenating the respective term “mass” values, one can model sentences or longer strings of symbols as vector-valued functions. Since the line integral of such functions is used to express the work of a particle in a gravitational field, the work equivalent of strings can be calculated.

  • 24.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Dobreva, Milena
    Position paper: Adding a 5M layer to the 5S model of digital libraries. 2010. Conference paper (Refereed)
    Abstract [en]

    We expect radical changes in document (first and foremost text) representation for digital libraries (DL) leading to new applications for document processing.

  • 25.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Dobreva, Milena
    Using wavelet analysis for text categorization in digital libraries: a first experiment with Strathprints. 2011. In: International Journal on Digital Libraries, ISSN 1432-5012, E-ISSN 1432-1300. Article in journal (Refereed)
    Abstract [en]

    Digital libraries increasingly benefit from research on automated text categorization for improved access. Such research is typically carried out by using standard test collections. In this paper we present a pilot experiment of replacing such test collections by a set of 6000 objects from a real-world digital repository, indexed by Library of Congress Subject Headings, and test support vector machines in a supervised learning setting for their ability to reproduce the existing classification. To augment the standard approach, we introduce a combination of two novel elements: using functions for document content representation in Hilbert space, and adding extra semantics from lexical resources to the representation. Results suggest that wavelet-based kernels slightly outperformed traditional kernels on classification reconstruction from abstracts and vice versa from full-text documents, the latter outcome due to word sense ambiguity. The practical implementation of our methodological framework enhances the analysis and representation of specific knowledge relevant to large-scale digital collections, in this case the thematic coverage of the collections. Representation of specific knowledge about digital collections is one of the basic elements of the persistent archives and the less studied one (compared to representations of digital objects and collections). Our research is an initial step in this direction developing further the methodological approach and demonstrating that text categorisation can be applied to analyse the thematic coverage in digital repositories.

  • 26.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Forró, László
    Toward Sequencing “Narrative DNA”: Tale Types, Motif Strings and Memetic Pathways. 2012. Conference paper (Other academic)
    Abstract [en]

    The Aarne-Thompson-Uther Tale Type Catalog (ATU) is a bibliographic tool which uses metadata from tale content, called motifs, to define tale types as canonical motif sequences. The motifs themselves are listed in another bibliographic tool, the Aarne-Thompson Motif Index (AaTh). Tale types in ATU are defined in an abstracted fashion and can be processed like a corpus. We analyzed 219 types with 1202 motifs from the “Tales of magic” (types 300-749) segment to exemplify that motif sequences show signs of recombination in the storytelling process. Compared to chromosome mutations in genetics, we offer examples for insertion/deletion, duplication and, possibly, transposition, whereas the sample was not sufficient to find inverted motif strings as well. These initial findings encourage efforts to sequence motif strings like DNA in genetics, attempting to find for instance the longest common motif subsequences in tales. Expressing the network of motif connections by graphs suggests that tale plots as consolidated pathways of content help one memorize culturally engraved messages. We anticipate a connection between such networks and Waddington’s epigenetic landscape.

  • 27.
    Darányi, Sándor
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Wittek, Peter
    University of Borås, Swedish School of Library and Information Science.
    Kitto, Kirsty
    The Sphynx's new riddle: How to relate the canonical formula of myth to quantum interaction. 2013. Conference paper (Refereed)
    Abstract [en]

    We introduce Claude Lévi-Strauss' canonical formula (CF), an attempt to rigorously formalise the general narrative structure of myth. This formula utilises the Klein group as its basis, but a recent work draws attention to its natural quaternion form, which opens up the possibility that it may require a quantum-inspired interpretation. We present the CF in a form that can be understood by a non-anthropological audience, using the formalisation of a key myth (that of Adonis) to draw attention to its mathematical structure. The future potential formalisation of mythological structure within a quantum-inspired framework is proposed and discussed, with a probabilistic interpretation further generalising the formula.

  • 28.
    Darányi, Sándor
    et al.
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Konstantinidis, K
    CERTH.
    Papadopoulos, S
    CERTH.
    A Potential Surface Underlying Meaning? 2015. Conference paper (Other academic)
    Abstract [en]

    Machine learning algorithms utilizing gradient descent to identify concepts or more general learnables hint at a so-far ignored possibility, namely that local and global minima represent any vocabulary as a landscape against which evaluation of the results can take place. A simple example to illustrate this idea would be a potential surface underlying gravitation. However, to construct a gravitation-based representation of, e.g., word meaning, only the distance between localized items is a given in the vector space, whereas the equivalents of mass or charge are unknown in semantics. Clearly, the working hypothesis that physical fields could be a useful metaphor to study word and sentence meaning is an option but our current representations are incomplete in this respect. For a start, consider that an RBF kernel has the capacity to generate a potential surface and hence create the impression of gravity, providing one with distance-based decay of interaction strength, plus a scalar scaling factor for the interaction, but of course no term masses. We are working on an experiment design to change that. Therefore, with certain mechanisms in neural networks that could host such quasi-physical fields, a novel approach to the modeling of mind content seems plausible, subject to scrutiny. Work in progress in another direction of the same idea indicates that by using certain algorithms, already emerged vs. still emerging content is clearly distinguishable, in line with Aristotle’s Metaphysics. The implications are that a model completed by “term mass” or “term charge” would enable the computation of the specific work equivalent of sentences or documents, and that via replacing semantics by other modalities, vector fields of more general symbolic content could exist as well. Also, the perceived hypersurface generated by the dynamics of language use may be a step toward more advanced models, for example addressing the Hamiltonian of expanding semantic systems, or the relationship between reaction paths in quantum chemistry vs. sentence construction by gradient descent.

  • 29.
    Darányi, Sándor
    et al.
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Konstantinidis, Konstantinos
    Papadopoulos, Symeon
    Kontopoulos, Efstratios
    A Physical Metaphor to Study Semantic Drift. 2016. In: Proceedings of SuCCESS-16, 1st International Workshop on Semantic Change & Evolving Semantics, 2016, Vol. 1695. Conference paper (Refereed)
    Abstract [en]

    In accessibility tests for digital preservation, over time we experience drifts of localized and labelled content in statistical models of evolving semantics represented as a vector field. This articulates the need to detect, measure, interpret and model outcomes of knowledge dynamics. To this end we employ a high-performance machine learning algorithm for the training of extremely large emergent self-organizing maps for exploratory data analysis. The working hypothesis we present here is that the dynamics of semantic drifts can be modeled on a relaxed version of Newtonian mechanics called social mechanics. By using term distances as a measure of semantic relatedness vs. their PageRank values indicating social importance and applied as variable ‘term mass’, gravitation as a metaphor to express changes in the semantic content of a vector field lends a new perspective for experimentation. From ‘term gravitation’ over time, one can compute its generating potential whose fluctuations manifest modifications in pairwise term similarity vs. social importance, thereby updating Osgood’s semantic differential. The dataset examined is the public catalog metadata of Tate Galleries, London.

  • 30. Declerck, Thierry
    et al.
    Lendvai, Piroska
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Multilingual and Semantic Extension of Folk Tale Catalogues. 2012. Conference paper (Refereed)
    Abstract [en]

    We address the multilingual and semantic upgrades of two digital catalogues of motifs and types in folk-literature: Thompson’s Motif-Index of Folk-Literature (TMI) and the Aarne-Thompson-Uther classification system (ATU). The methods convert, translate, and represent their digitized content in terms of various (so far often implicit) structural and linguistic components. The results will enable (i) utilizing these resources for semi-automatic analysis and indexing of texts of relevant genres, in a multilingual setting, and (ii) pre-processing the data for analysing motif sequences in folktale plots. We plan to publish the resulting data, which can be made available in the Linked Open Data (LOD) framework.

  • 31. Dominich, Sándor
    et al.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Szlávik, Z.
    Magyar hiedelmek hierarchikus struktúrája [The hierarchical structure of Hungarian folk beliefs]. 2006. In: Alkalmazott Nyelvtudomány, ISSN 1587-1061, Vol. 6, no 1-2, p. 137-160. Article in journal (Refereed)
  • 32. Hedges, Mark
    et al.
    Waddington, Simon
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Maceviciute, Elena
    University of Borås, Swedish School of Library and Information Science.
    Wilson, Tom
    Kompatsiaris, Yiannis
    Dasiopoulu, Stamatia
    Spyroglou, Odysseas
    Ludwig, Jens
    Wieder, Philipp
    Watry, Paul
    Hasan, Adil
    Corubolo, Fabio
    Pinchuk, Rani
    Chanod, Jean-Pierre
    Vion-Dury, Jean-Yves
    Baxter, Rob
    Laurenson, Pip
    Muller, Christian
    PERICLES: Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics. 2013. Conference paper (Other academic)
    Abstract [en]

    This poster paper describes the objectives, approach and use cases of the EC FP7 Integrated Project PERICLES. The project began on 1st February 2013 and runs for four years. The aim is to research and prototype solutions for digital preservation in continually evolving environments including changes in context, semantics and practices. The project addresses use cases focusing on digital art, media and science.

  • 33.
    Kontopoulos, E.
    et al.
    CERTH.
    Corubolo, F.
    University of Liverpool.
    Eggers, A.
    University of Göttingen.
    Ludwig, J.
    University of Göttingen.
    Wieder, P.
    GWDG - Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen.
    Hedges, M.
    King's College London.
    Waddington, S.
    King's College London.
    Chanod, J-P.
    Xerox European Research Centre.
    Vion-Dury, J-Y.
    Xerox European Research Centre.
    Hasan, A.
    University of Liverpool.
    Watry, P.
    University of Liverpool.
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Pinchuk, R.
    SpaceApps.
    Laurenson, P.
    Tate.
    Mueller, C.
    B.USOC.
    Spyroglou, O.
    Dotsoft.
    Kompatsiaris, I.
    CERTH.
    PERICLES EU Integrated Project: Research Strategy and First Results2015In: Proceedings of EU Project Networking Session, 2015Conference paper (Other academic)
  • 34.
    Kontopoulos, E.
    et al.
    CERTH, Thessaloniki, Greece.
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Konstantinidis, K.
    CERTH, Thessaloniki, Greece.
    Riga, M.
    CERTH, Thessaloniki, Greece.
    Mitzias, P.
    CERTH, Thessaloniki, Greece.
    Stavropoulos, T.
    CERTH, Thessaloniki, Greece.
    Andreadis, S.
    CERTH, Thessaloniki, Greece.
    Maronidis, A.
    CERTH, Thessaloniki, Greece.
    Karakostas, A.
    CERTH, Thessaloniki, Greece.
    Tachos, S.
    CERTH, Thessaloniki, Greece.
    Kaltsa, V.
    CERTH, Thessaloniki, Greece.
    Tsagiopoulu, M.
    CERTH, Thessaloniki, Greece.
    Avgerinakis, K.
    CERTH, Thessaloniki, Greece.
    Deliverable 4.5: Context-aware Content Interpretation2016Report (Refereed)
    Abstract [en]

    The current deliverable summarises the work conducted within task T4.5 of WP4, presenting our proposed approaches for contextualised content interpretation, aimed at gaining insightful contextualised views on content semantics. This is achieved through the adoption of appropriate context-aware semantic models developed within the project, and via enriching the semantic descriptions with background knowledge, thus deriving higher level contextualised content interpretations that are closer to human perception and appraisal needs. More specifically, the main contributions of the deliverable are the following: A theoretical framework using physics as a metaphor to develop different models of evolving semantic content. A set of proof-of-concept models for semantic drifts due to field dynamics, introducing two methods to identify quantum-like (QL) patterns in evolving information searching behaviour, and a QL model akin to particle-wave duality for semantic content classification. Integration of two specific tools, Somoclu for drift detection and Ncpol2sdpa for entanglement detection. An “energetic” hypothesis accounting for contextualized evolving semantic structures over time. A proposed semantic interpretation framework, integrating (a) an ontological inference scheme based on Description Logics (DL), (b) a rule-based reasoning layer built on SPARQL Inference Notation (SPIN), (c) an uncertainty management framework based on non-monotonic logics. A novel scheme for contextualized reasoning on semantic drift, based on LRM dependencies and OWL’s punning mechanism. An implementation of SPIN rules for policy and ecosystem change management, with the adoption of LRM preconditions and impacts. Specific use case scenarios demonstrate the context under development and the efficiency of the approach. Respective open-source implementations and experimental results that validate all the above. All these contributions are tightly interlinked with the other PERICLES work packages: WP2 supplies the use cases and sample datasets for validating our proposed approaches, WP3 provides the models (LRM and Digital Ecosystem models) that form the basis for our semantic representations of content and context, WP5 provides the practical application of the technologies developed to preservation processes, while the tools and algorithms presented in this deliverable can be deployed in combination with test scenarios, which will be part of the WP6 test beds.

  • 35. Kontopoulos, Efstratios
    et al.
    Moysiadis, Theodoros
    Tsagiopoulou, Maria
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Papakonstantinou, Nikos
    Ntoufa, Stavroula
    Meditskos, Georgios
    Stamatopoulos, Kostas
    Kompatsiaris, Ioannis
    Studying the Cohesion Evolution of Genes Related to Chronic Lymphocytic Leukemia Using Semantic Similarity in Gene Ontology and Self-Organizing Maps2016In: Proceedings of SWAT4LS-16, 9th International Conference on Semantic Web Applications and Tools for Life Sciences, 2016Conference paper (Refereed)
    Abstract [en]

    A significant body of work on biomedical text mining is aimed at uncovering meaningful associations between biological entities, including genes. This has the potential to offer new insights for research, uncovering hidden links between genes involved in critical pathways and processes. Recently, high-throughput studies have started to unravel the genetic landscape of chronic lymphocytic leukemia (CLL), the most common adult leukemia. CLL displays remarkable clinical heterogeneity, likely reflecting its underlying biological heterogeneity which, despite all progress, still remains insufficiently characterized and understood. This paper deploys an ontology-based semantic similarity combined with self-organizing maps for studying the temporal evolution of cohesion among CLL-related genes and the extracted information. Three consecutive time periods are considered and groups of genes are derived therein. Our preliminary results indicated that our proposed gene groupings are meaningful and that the temporal dimension indeed impacted the gene cohesion, leaving a lot of room for further promising investigations.

  • 36.
    Kontopoulos, Efstratios
    et al.
    CERTH, Thessaloniki, Greece.
    Riga, Marina
    CERTH, Thessaloniki, Greece.
    Mitzias, P.
    CERTH, Thessaloniki, Greece.
    Andreadis, S.
    CERTH, Thessaloniki, Greece.
    Stavropoulos, T.
    CERTH, Thessaloniki, Greece.
    Konstantinidis, K.
    CERTH, Thessaloniki, Greece.
    Maronidis, A.
    CERTH, Thessaloniki, Greece.
    Karakostas, A.
    CERTH, Thessaloniki, Greece.
    Tachos, S.
    CERTH, Thessaloniki, Greece.
    Kaltsa, V.
    CERTH, Thessaloniki, Greece.
    Tsagiopoulu, M.
    CERTH, Thessaloniki, Greece.
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Gill, A.
    King's College London, UK.
    Tonkin, E. L.
    King's College London, UK.
    Waddington, S.
    King's College London, UK.
    Sauter, Ch.
    King's College London, UK.
    Corubolo, F.
    University of Liverpool, UK.
    PERICLES Deliverable 4.4: Modelling Contextualised Semantics2016Report (Refereed)
    Abstract [en]

    The current deliverable summarises the work conducted within task T4.4 of WP4, presenting our proposed models for semantically representing digital content and its respective context – the latter refers to any information coming from the environment of the digital object (DO) that offers a better insight into the object’s status, its  interrelationships with other content items and information about the object’s context of use. Within PERICLES, we refer to the content semantics enriched with the contextual perspective as “contextualised semantics”. The deliverable presents two complementary modelling approaches, based respectively on (a) ontologies and logics, and, (b) multivariate statistics.

  • 37. Lendvai, Piroska
    et al.
    Declerck, Thierry
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Gervás, Pablo
    Hervás, Raquel
    Malec, Scott
    Peinado, Federico
    Integration of Linguistic Markup into Semantic Models of Folk Narratives: The Fairy Tale Use Case. In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)2010Conference paper (Refereed)
    Abstract [en]

    Propp’s influential structural analysis of fairy tales created a powerful schema for representing storylines in terms of character functions, which is directly exploitable for computational semantic analysis and procedural generation of stories of this genre. We tackle two resources that draw on the Proppian model: one formalizes it as a semantic markup scheme and the other as an ontology, and both lack explicitly represented linguistic phenomena. The need for integrating linguistic information into structured semantic resources is motivated by the emergence of suitable standards that facilitate this, and the benefits such joint representation would create for transdisciplinary research across Digital Humanities, Computational Linguistics, and Artificial Intelligence.

  • 38. Lendvai, Piroska
    et al.
    Declerck, Thierry
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Malec, Scott
    Propp Revisited: Integration of Linguistic Markup into Structured Content Descriptors of Tales2010In: Proceedings of the Conference for Digital Humanities 2010, 2010Conference paper (Refereed)
    Abstract [en]

    Metadata that serve as semantic markup, such as conceptual categories that describe the macrostructure of a plot in terms of actors and their mutual relationships, actions, and their ingredients annotated in folk narratives, are important additional resources of digital humanities research. Traditionally originating in structural analysis, in fairy tales they are called functions (Propp, 1968), whereas in myths they are called mythemes (Lévi-Strauss, 1955); a related, overarching type of content metadata is a folklore motif (Uther, 2004; Jason, 2000). In his influential study, Propp treated a corpus of tales in Afanas'ev's collection (Afanas'ev, 1945), establishing basic recurrent units of the plot ('functions'), such as Villainy, Liquidation of misfortune, Reward, or Test of Hero, and the combinations and sequences of elements employed to arrange them into moves. His aim was to describe the DNA-like structure of the magic tale sub-genre as a novel way to provide comparisons. As a start along the way to developing a story grammar, the Proppian model is relatively straightforward to formalize for computational semantic annotation, analysis, and generation of fairy tales. Our study describes an effort towards creating a comprehensive XML markup of fairy tales following Propp's functions, by an approach that integrates functional text annotation with grammatical markup in order to be used across text types, genres and languages. The Proppian fairy tale Markup Language (PftML) (Malec, 2001) is an annotation scheme that enables narrative function segmentation, based on hierarchically ordered textual content objects. We propose to extend PftML so that the scheme would additionally rely on linguistic information for the segmentation of texts into Proppian functions. Textual variation is an important phenomenon in folklore; it is thus beneficial to explicitly represent linguistic elements in computational resources that draw on this genre. Current international initiatives also actively promote and aim to technically facilitate such integrated and standardized linguistic resources. We describe why and how explicit representation of grammatical phenomena in literary models can provide interdisciplinary benefits for the digital humanities research community. We address the above as part of our ongoing activities in two related projects, CLARIN and AMICUS. CLARIN aims to contribute to humanities research by creating and recommending effective workflows using natural language processing tools and digital resources in scenarios where text-based research is conducted by humanities or social sciences scholars. AMICUS is interested in motif identification, in order to gain insight into higher-order correlations of functions and other content units in texts from the cultural heritage and scientific discourse domains. We expect significant synergies from their interaction with the PftML prototype.

  • 39.
    Malec, S.
    et al.
    University of Texas (Austin).
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Widdows, D.
    Microsoft Bing.
    Cohen, T.
    University of Texas (Austin).
    Landing Propp in Interaction Space: First Steps Toward Scalable Open Domain Narrative Analysis With Predication-based Semantic Indexing2015Conference paper (Refereed)
    Abstract [en]

    In this paper, we explore the possibility of applying high-dimensional vector representations of concept-relation-concept triplets, which have been successfully applied to model a small set of relationship types in the biomedical domain, to the task of modeling folk tales. In doing so, our ultimate aim is to develop representations of narratives through which their underlying structure can be compared. The current paper describes our progress toward this aim, with emphasis on addressing the technical challenges involved in moving from the relatively constrained set of relations that have been extracted from biomedical text to the much larger set of unnormalized relations that have been extracted from the open domain. A toy example using graded vectors demonstrates that our approach will be feasible once more material is added to the test collection.

  • 40.
    Maronidis, A.
    et al.
    CERTH, Thessaloniki, Greece.
    Chatzilari, E.
    CERTH, Thessaloniki, Greece.
    Kontopoulos, E.
    CERTH, Thessaloniki, Greece.
    Nikopoulos, S.
    CERTH, Thessaloniki, Greece.
    Riga, M.
    CERTH, Thessaloniki, Greece.
    Mitzias, P.
    CERTH, Thessaloniki, Greece.
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Gill, A.
    King's College London, UK.
    Tonkin, E.L.
    King's College London, UK.
    De Weerdt, D.
    SpaceApps, Belgium.
    Corubolo, F.
    University of Liverpool, UK.
    Waddington, S.
    King's College London, UK.
    Sauter, Ch.
    King's College London, UK.
    PERICLES Deliverable 4.3: Content Semantics and Use Context Analysis Techniques2016Report (Refereed)
    Abstract [en]

    The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and the subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied by existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of the digital objects, as well as their ability to be accurately interpreted as initially intended.

  • 41. Meroño Peñuela, Albert
    et al.
    Wittek, Peter
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Visualizing the Drift of Linked Open Data Using Self-Organizing Maps2016In: Proceedings of Drift-a-LOD Workshop at the 20th International Conference on Knowledge Engineering and Knowledge Management, 2016Conference paper (Refereed)
    Abstract [en]

    The urge for evolving the Web into a globally shared dataspace has turned the Linked Open Data (LOD) cloud into a massive platform containing 100 billion machine-readable statements. Several factors hamper a historical study of the evolution of the LOD cloud, and hence forecasting its future: its ever-growing scale, which makes a global analysis difficult; its Web-distributed nature, which challenges the analysis of its data; and the scarcity of regular and time-stamped archival dumps. Recently, a scalable implementation of self-organizing maps (SOM) has been developed to visualize the local topology of high-dimensional data. We use this methodology to address scalability issues, and the Dynamic Linked Data Observatory, a regular biweekly, centralized sample of the LOD cloud, as a time-stamped collection. We visualize the drift of Linked Datasets between 2012 and 2016, finding that datasets with high availability, high vocabulary reuse, and modeling with commonly used terms in the LOD cloud are better traceable across time.

  • 42. Ofek, Nir
    et al.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Rokach, Lior
    Linking Motif Sequences to Tale Types by Machine Learning2013Conference paper (Refereed)
    Abstract [en]

    Abstract units of narrative content called motifs constitute sequences, also known as tale types. However, whereas the dependency of tale types on the constituent motifs is clear, the strength of their bond has not been measured thus far. Based on the observation that differences between such motif sequences are reminiscent of nucleotide and chromosome mutations in genetics, i.e., constitute “narrative DNA”, we used sequence mining methods from bioinformatics to learn more about the nature of tale types as a corpus. 94% of the Aarne-Thompson-Uther catalogue (2249 tale types in 7050 variants) was listed as individual motif strings based on the Thompson Motif Index, and scanned for similar subsequences. Next, using machine learning algorithms, we built and evaluated a classifier which predicts the tale type of a new motif sequence. Our findings indicate that, due to the size of the available samples, the classification model was best able to predict magic tales, novelles and jokes.

  • 43.
    Pocklington, Michael
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Eggers, Anna-Grit
    University of Borås, Swedish School of Library and Information Science.
    Corubolo, Fabio
    University of Borås, Swedish School of Library and Information Science.
    Hedges, Mark
    University of Borås, Swedish School of Library and Information Science.
    Ludwig, Jens
    University of Borås, Swedish School of Library and Information Science.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    A Biological Perspective on Digital Ecosystems and Digital Preservation2014In: Proceedings of the 11th International Conference on Digital Preservation, 6-10 October, 2014, Melbourne, Australia / [ed] Serena Coates, Ross King, Steve Knight, Christopher Lee, Peter McKinney, Erin O'Meara, David Pearson, 2014, p. 363-365Conference paper (Other academic)
    Abstract [en]

    Successful preservation of Digital Objects (DOs) ultimately demands a solid theoretical framework. Such a framework with a high degree of generality emerges by treating DOs as containers of functional genetic information, exactly as in the genomes of organisms. We observe that functionality links survival in organisms and utility in DOs. In both cases, functional information is identifiable in principle by the consequence of its ablation. In molecular biology, genetic ablations (mutations) and environmental ablations (experimental manipulations) are used to construct interaction maps fully representing organismic activity. The equivalent of such interaction maps are dependency networks for the use of DOs within their Digital Environment (DE). In the poster we will present early work on the application of the theoretical background. It includes first results from a case-study examining a software-based art preservation scenario (SBA) developed as part of the PERICLES FP7 project [1].

  • 44. Szöts, Miklós
    et al.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Alexin, Zoltán
    Vincze, Veronika
    Almási, Attila
    Semantic Processing of a Hungarian Ethnographic Corpus2010In: Proceedings of the 1st International AMICUS Workshop, October 21, 2010, Vienna, Austria, p. 112-115Article in journal (Refereed)
    Abstract [en]

    In this poster, a Hungarian ethnographic database containing linguistic annotation is presented. The corpus contains texts from three domains, namely, folk beliefs, táltos texts and tales. All the possible morphosyntactic analyses assigned to each word and the appropriate one selected from them (based on contextual information) are also marked. Syntactic (dependency) annotation is added semi-automatically to the corpus texts in a second phase of the processing. With the help of these enriched linguistic attributes, the texts can be semantically analyzed and clustered. The research and development team is working on a semantic search tool that enables users to browse the texts on the basis of their meaning. The proposed technology may result in a new approach to ethnographic research and may open a new type of access to the databases.

  • 45.
    Waddington, Simon
    et al.
    King's College London, UK.
    Hedges, Mark
    King's College London, UK.
    Riga, Marina
    CERTH, Thessaloniki, Greece.
    Mitzias, Panagiotis
    CERTH, Thessaloniki, Greece.
    Kontopoulos, Efstratios
    CERTH, Thessaloniki, Greece.
    Kompatsiaris, Ioannis
    CERTH, Thessaloniki, Greece.
    Vion-Dury, Jean-Yves
    XRCE, Grenoble, France.
    Lagos, Nikolaos
    XRCE, Grenoble, France.
    Darányi, Sándor
    University of Borås, Faculty of Librarianship, Information, Education and IT.
    Corubolo, Fabio
    University of Liverpool, UK.
    Muller, Christian
    BUSOC, Belgium.
    McNeill, John
    Tate Galleries, London, UK.
    PERICLES – Digital Preservation through Management of Change in Evolving Ecosystems.2016In: The Success of European Projects Using New Information and Communication Technologies / [ed] Hamriouni, S., Setubal, Portugal, 2016, p. 51-74Conference paper (Refereed)
    Abstract [en]

    Management of change is essential to ensure the long-term reusability of digital assets. Change can be brought about in many ways, including through technological, user community and policy factors. Motivated by case studies in space science and time-based media, we consider the impact of change on complex digital objects comprising multiple interdependent entities, such as files, software and documentation. Our approach is based on modelling of digital ecosystems, in which abstract representations are used to assess risks to sustainability and support tasks such as appraisal. The paper is based on work of the EU FP7 PERICLES project on digital preservation, and presents some general concepts as well as a description of selected research areas under investigation by the project.

  • 46.
    Wittek, Peter
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    A GPU-Accelerated Algorithm for Self-Organizing Maps in a Distributed Environment.2012Conference paper (Refereed)
    Abstract [en]

    In this paper we introduce a MapReduce-based implementation of self-organizing maps that performs compute-bound operations on distributed GPUs. The kernels are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes with two NVidia Tesla M2050s attached to each, and we achieve a 10x speedup for self-organizing maps over a distributed CPU algorithm.

  • 47.
    Wittek, Peter
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Accelerating Text Mining Workloads in a MapReduce-based Distributed GPU Environment2013In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 73, no 2, p. 198-206Article in journal (Refereed)
    Abstract [en]

    Scientific computations have been using GPU-enabled computers successfully, often relying on distributed nodes to overcome the limitations of device memory. Only a handful of text mining applications benefit from such infrastructure. Since the initial steps of text mining are typically data intensive, and the ease of deployment of algorithms is an important factor in developing advanced applications, we introduce a flexible, distributed, MapReduce-based text mining workflow that performs I/O-bound operations on CPUs with industry-standard tools and then runs compute-bound operations on GPUs which are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes with two NVidia Tesla M2050s attached to each, and we achieve considerable speedups for random projection and self-organizing maps.

  • 48.
    Wittek, Peter
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Digital Preservation in Grids and Clouds: A Middleware Approach2012In: Journal of Grid Computing, ISSN 1570-7873, E-ISSN 1572-9184, Vol. 10, no 1, p. 133-149Article in journal (Refereed)
    Abstract [en]

    Digital preservation is the persistent archiving of digital assets for future access and reuse, irrespective of the underlying platform and software solutions. Existing preservation systems have a strong focus on grids, but the advent of cloud technologies offers an attractive option. We describe a middleware system that enables a flexible choice between a grid and a cloud for ad-hoc computations that arise during the execution of a preservation workflow and also for archiving digital objects. The choice between different infrastructures remains open during the lifecycle of the archive, ensuring a smooth switch between different solutions to accommodate the changing requirements of the organization that needs its digital assets preserved. We also offer insights on the costs, running times, and organizational issues of cloud computing, proving that the cloud alternative is particularly attractive for smaller organizations without access to a grid or with limited IT infrastructure.

  • 49.
    Wittek, Peter
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Introducing Scalable Quantum Approaches in Language Representation2011Conference paper (Refereed)
    Abstract [en]

    High-performance computational resources and distributed systems are crucial for the success of real-world language technology applications. The novel paradigm of general-purpose computing on graphics processors (GPGPU) offers a feasible and economical alternative: it has already become a common phenomenon in scientific computation, with many algorithms adapted to the new paradigm. However, applications in language technology do not readily adapt to this approach. Recent advances show the applicability of quantum metaphors in language representation, and many algorithms in quantum mechanics have already been adapted to GPGPU computing. SQUALAR aims to match quantum algorithms with heterogeneous computing to develop new formalisms of information representation for natural language processing in quantum environments.

  • 50.
    Wittek, Peter
    et al.
    University of Borås, Swedish School of Library and Information Science.
    Darányi, Sándor
    University of Borås, Swedish School of Library and Information Science.
    Leveraging on High-Performance Computing and Cloud Technologies in Digital Libraries: A Case Study2011Conference paper (Refereed)
    Abstract [en]

    With the emergence of high-performance computing instances in the cloud, massive scale computations have become available to technically every organization. Digital libraries typically employ a data-intensive infrastructure, but given the resources, advanced services based on data and text mining could be developed. A fundamental issue is the ease of development and integration of such services. We demonstrate the feasibility by providing a case study on a visual machine learning algorithm with MapReduce running in the cloud in a small cluster.
