e-space
Manchester Metropolitan University's Research Repository

    Exploring Sketches for Probability Estimation with Sublinear Memory

    Kleerekoper, Anthony, Lujan, Mikel and Brown, Gavin (2013) Exploring Sketches for Probability Estimation with Sublinear Memory. In: 2013 IEEE International Conference on Big Data, 06 October 2013 - 09 October 2013, Silicon Valley, CA, USA.

    Accepted Version
    Available under License In Copyright.


    Abstract

    As data sets become ever larger, it becomes increasingly complex to apply traditional machine learning techniques to them. Feature selection can greatly reduce the computational requirements of machine learning, but it too can be memory intensive. In this paper we explore the use of succinct data structures called sketches for probability estimation as a component of information-theoretic feature selection. These data structures use memory that is sublinear in the number of items, but they were designed only for estimating the frequency of the most frequent items. To the best of our knowledge, this is the first time they have been examined for estimating the frequency of all items, and we find that some information-theoretic measures can often be estimated to within a few percent of the correct values.
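
To make the idea of sketch-based probability estimation concrete, here is a minimal illustration in Python. It assumes a Count-Min sketch, one well-known sketch type; the abstract does not say which sketches the paper evaluates, so the class name, the `width` and `depth` parameters, the SHA-256 row hashing, and the toy stream are all assumptions made for illustration, not the authors' implementation.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch: a fixed depth x width table of counters.

    Memory depends only on width * depth, not on the number of distinct items.
    """

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]
        self.total = 0  # number of items seen so far

    def _index(self, item, row):
        # Derive one hash per row by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item):
        self.total += 1
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += 1

    def count(self, item):
        # Collisions can only inflate a counter, so the row minimum
        # is the tightest available (never too low) estimate.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )

    def probability(self, item):
        # Estimated relative frequency of the item in the stream.
        return self.count(item) / self.total if self.total else 0.0


# Hypothetical toy stream with known frequencies.
stream = ["a"] * 50 + ["b"] * 30 + ["c"] * 15 + ["d"] * 5
sketch = CountMinSketch()
for x in stream:
    sketch.add(x)
estimates = {x: sketch.probability(x) for x in "abcd"}
```

Because hash collisions only add to counters, each estimate is at least the true relative frequency. Rare items are the most exposed to this overestimation, which is why estimating the frequency of all items, rather than only the most frequent ones, is the harder case.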
