Wednesday, May 4, 2016

 

 





Near-Duplicate Detection Using Instance Level Constraints


Resource details

Belongs to: ETD at Indian Institute of Science

Description: For near-duplicate document detection, the bag-of-words comparison approaches used in the information retrieval community are not sufficiently accurate. This work presents a novel approach for the setting in which instance-level constraints are given for a set of documents, and those documents must be retrieved when a new query document arrives. The framework incorporates the instance-level constraints and clusters the documents into groups using a novel clustering approach, Grouped Latent Dirichlet Allocation (gLDA). A distance metric is then learned for each cluster using the large margin nearest neighbor algorithm, and finally the documents are ranked against a new, unknown document using the learned distance metrics. Experimental results on various datasets demonstrate that our clustering method (gLDA with side constraints) performs better than other clustering methods and that the overall approach outperforms other near-duplicate detection algorithms.
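The final ranking step described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the clusters have already been produced (by gLDA in the thesis) and that a metric per cluster has already been learned (by large margin nearest neighbor in the thesis). Here each metric is simplified to a hand-set diagonal weighting over terms, and the names `bag_of_words`, `weighted_distance`, and `rank_near_duplicates` are hypothetical.

```python
from collections import Counter


def bag_of_words(text):
    """Return term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def weighted_distance(a, b, weights):
    """Squared distance under a diagonal metric; unlisted terms weigh 1.0."""
    terms = set(a) | set(b)
    return sum(weights.get(t, 1.0) * (a[t] - b[t]) ** 2 for t in terms)


def rank_near_duplicates(query, clusters, metrics):
    """Rank all documents by distance to the query, smallest first.

    clusters: {cluster_id: [(doc_id, text), ...]}  (assumed given)
    metrics:  {cluster_id: {term: weight}}         (assumed already learned)
    """
    q = bag_of_words(query)
    scored = []
    for cluster_id, docs in clusters.items():
        # Each document is compared using the metric of its own cluster.
        weights = metrics.get(cluster_id, {})
        for doc_id, text in docs:
            distance = weighted_distance(q, bag_of_words(text), weights)
            scored.append((doc_id, distance))
    return sorted(scored, key=lambda pair: pair[1])


# Toy data: two clusters, with placeholder weights standing in for learned metrics.
clusters = {
    "a": [("d1", "the quick brown fox"), ("d2", "the quick brown dog")],
    "b": [("d3", "stock markets fell sharply today")],
}
metrics = {"a": {"the": 0.1}, "b": {}}
ranked = rank_near_duplicates("the quick brown fox jumps", clusters, metrics)
print(ranked)
```

Comparing distances computed under different per-cluster metrics is a simplification made here for brevity; a fuller version might first assign the query to a cluster and rank only within it.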

Author(s): Patel, Vishal

Id.: 54390552

Language: English (United States)

Version: 1.0

Status: Final

Keywords: Document Clustering - Artificial Intelligence

Resource type: Thesis

Interactivity type: Expositive

Interactivity level: very low

Audience: Student - Teacher - Author

Structure: Atomic

Cost: no

Copyright: yes

Technical requirements: Browser: Any

Relation: [References] G23536

Contribution date: 10-Aug-2011

Contact:

Location:


Other resources by the same author(s)

  1. Determination of galantamine hydrobromide in bulk drug and pharmaceutical dosage form by spectrofluorimetry Aim: To develop a simple, accurate, sensitive, rapid and precise method for the determination of gal...
  2. Using Aggregated, De-Identified Electronic Health Record Data for Multivariate Pharmacosurveillance: A Case Study of Azathioprine
  3. PETALS: Proteomic Evaluation and Topological Analysis of a mutated Locus' Signaling Background: Colon cancer is driven by mutations in a number of genes, the m...
  4. Treatment of severe falciparum malaria: quinine versus artesunate Background: Malaria is the most important disease of human being. More than 40% of the world’s ...
  5. REVIEW ON QUALITY SAFETY AND LEGISLATION FOR HERBAL PRODUCTS In the last few decades, there has been exponential growth in the field of herbal medicine. The grow...

Other resources from the same collection

  1. Mechanism Design For Strategic Crowdsourcing This thesis looks into the economics of crowdsourcing using game theoretic modeling. The art of aggr...
  2. Studies In Automatic Management Of Storage Systems Autonomic management is important in storage systems and the space of autonomics in storage systems ...
  3. Power Efficient Last Level Cache For Chip Multiprocessors The number of processor cores and on-chip cache size has been increasing on chip multiprocessors (CM...
  4. Learning Robust Support Vector Machine Classifiers With Uncertain Observations The central theme of the thesis is to study linear and non linear SVM formulations in the presence o...
  5. Sentiment-Driven Topic Analysis Of Song Lyrics Sentiment Analysis is an area of Computer Science that deals with the impact a document makes on a u...
