Resource details

Description

For near-duplicate document detection, the bag-of-words comparison approaches used in the information retrieval community are not sufficiently accurate. This work presents a novel approach for the setting in which instance-level constraints are given for documents and near-duplicates must be retrieved for a new query document. The framework incorporates the instance-level constraints and clusters documents into groups using a novel clustering approach, Grouped Latent Dirichlet Allocation (gLDA). A distance metric is then learned for each cluster using the large margin nearest neighbor algorithm, and finally documents are ranked for a new, unknown document using the learned distance metrics. Experimental results on various datasets demonstrate that our clustering method (gLDA with side constraints) performs better than other clustering methods and that the overall approach outperforms other near-duplicate detection algorithms.
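The retrieval step described above (route the query to a cluster, then rank that cluster's documents under the cluster's own metric) can be illustrated with a minimal sketch. This is not the thesis's implementation: gLDA and large margin nearest neighbor are not implemented here. The clusters are given by hand, and each "learned" metric is replaced by a hand-picked diagonal weight vector. All function names, the vocabulary, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def bow(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def weighted_distance(x, y, weights):
    """Diagonal Mahalanobis distance: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rank_near_duplicates(query, clusters, metrics, vocab):
    """Pick the closest cluster, then rank its documents by that cluster's metric."""
    q = bow(query, vocab)
    ones = [1.0] * len(vocab)
    # Step 1: route the query to the cluster with the nearest centroid
    # (plain Euclidean distance; a stand-in for inference under gLDA).
    best = min(
        clusters,
        key=lambda c: weighted_distance(
            q, centroid([bow(d, vocab) for d in clusters[c]]), ones
        ),
    )
    # Step 2: rank that cluster's documents with its own metric
    # (placeholder weights; a stand-in for a metric learned with LMNN).
    scored = [
        (weighted_distance(q, bow(d, vocab), metrics[best]), d)
        for d in clusters[best]
    ]
    return best, sorted(scored)


# Assumed toy data: two hand-made clusters and hand-picked per-cluster weights.
vocab = ["cache", "power", "core", "sensor", "clock", "network"]
clusters = {
    "architecture": ["cache power core", "cache cache core"],
    "networks": ["sensor clock network", "sensor network network"],
}
metrics = {
    "architecture": [1.0, 0.2, 1.0, 1.0, 1.0, 1.0],  # down-weights "power"
    "networks": [1.0, 1.0, 1.0, 1.0, 0.2, 1.0],      # down-weights "clock"
}

best, ranked = rank_near_duplicates("cache core", clusters, metrics, vocab)
for dist, doc in ranked:
    print(f"{best}: {dist:.3f}  {doc}")
```

Under a plain Euclidean metric both documents in the selected cluster would be equally far from the query. The per-cluster weights break that tie, which is the role the learned metrics play in the approach described above.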

Belongs to

ETD at Indian Institute of Science  

Author(s)

Patel, Vishal

Id.: 54390552

Language: English (United States)

Version: 1.0

Status: Final

Keywords: Document Clustering - Artificial Intelligence

Resource type: Thesis

Interactivity type: Expositive

Interactivity level: very low

Audience: Student - Teacher - Author

Structure: Atomic

Cost: no

Copyright: yes

Technical requirements: Browser: Any

Relation: [References] G23536

Contribution date: 10-Aug-2011

Contact:

Location:

Other resources by the same author(s)

  1. Not all right-sided hearts are the same—the importance of identifying the correct diagnosis Scimitar syndrome is characterized by an anomalous venous return with the characteristic chest roent...
  2. Multivariate metabotyping of plasma predicts survival in patients with decompensated cirrhosis
  3. Quality of life among patients after bilateral prophylactic mastectomy: A systematic review of patient reported outcomes
  4. MCUR1 Is a Scaffold Factor for the MCU Complex Function and Promotes Mitochondrial Bioenergetics Mitochondrial Ca2+ Uniporter (MCU)-dependent mitochondrial Ca2+ uptake is the primary mechanism for ...
  5. The Cytoplasmic Prolyl-tRNA Synthetase of the Malaria Parasite is a Dual-Stage Target for Drug Development The emergence of drug resistance is a major limitation of current antimalarials. The discovery of ne...

Other resources from the same collection

  1. Weighted Average Based Clock Synchronization Protocols For Wireless Sensor Networks Wireless Sensor Networks (WSNs) consist of a large number of resource constrained sensor nodes equip...
  2. Mechanism Design For Strategic Crowdsourcing This thesis looks into the economics of crowdsourcing using game theoretic modeling. The art of aggr...
  3. Studies In Automatic Management Of Storage Systems Autonomic management is important in storage systems and the space of autonomics in storage systems ...
  4. Power Efficient Last Level Cache For Chip Multiprocessors The number of processor cores and on-chip cache size has been increasing on chip multiprocessors (CM...
  5. Learning Robust Support Vector Machine Classifiers With Uncertain Observations The central theme of the thesis is to study linear and non linear SVM formulations in the presence o...
