Victor


Victor is a web page cleaning tool. It removes menus, ads, footers, headers, and other non-content elements from HTML web pages, so that only the main page content remains. Victor is based on a conditional random fields (CRF) algorithm.
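To illustrate the general idea of cleaning a page with conditional random fields, the following is a minimal sketch, assuming a linear-chain formulation in which a page is a sequence of text blocks and each block is labelled as content or boilerplate. It is not Victor's actual model: the feature names, the thresholds, the hand-set weights, and the function names (`block_features`, `viterbi`, `label_blocks`) are all hypothetical, and a real CRF would learn its weights from annotated pages. The sketch only shows Viterbi decoding, the standard way to find the best label sequence in a linear-chain model.

```python
from typing import Dict, List, Sequence, Tuple

LABELS = ("content", "boilerplate")

# Hypothetical hand-set weights; a trained CRF would learn these from data.
EMISSION: Dict[str, Dict[str, float]] = {
    "long_text": {"content": 2.0, "boilerplate": -1.0},
    "short_text": {"content": -0.5, "boilerplate": 0.5},
    "many_links": {"content": -1.0, "boilerplate": 2.0},
}
# Neighbouring blocks tend to share a label, so staying is rewarded.
TRANSITION: Dict[Tuple[str, str], float] = {
    ("content", "content"): 1.0,
    ("content", "boilerplate"): -1.0,
    ("boilerplate", "content"): -1.0,
    ("boilerplate", "boilerplate"): 1.0,
}


def block_features(text: str, link_count: int) -> List[str]:
    """Map one text block to illustrative features (assumed thresholds)."""
    feats = ["long_text" if len(text.split()) >= 10 else "short_text"]
    if link_count >= 3:
        feats.append("many_links")
    return feats


def emission_score(feats: Sequence[str], label: str) -> float:
    """Sum the weights of the active features for a given label."""
    return sum(EMISSION[f][label] for f in feats)


def viterbi(blocks: Sequence[Sequence[str]]) -> List[str]:
    """Return the highest-scoring label sequence for the feature lists."""
    if not blocks:
        return []
    scores = [{lab: emission_score(blocks[0], lab) for lab in LABELS}]
    back: List[Dict[str, str]] = []
    for feats in blocks[1:]:
        prev = scores[-1]
        cur: Dict[str, float] = {}
        ptr: Dict[str, str] = {}
        for lab in LABELS:
            # Best previous label when the current block takes label `lab`.
            best = max(LABELS, key=lambda p: prev[p] + TRANSITION[(p, lab)])
            cur[lab] = prev[best] + TRANSITION[(best, lab)] + emission_score(feats, lab)
            ptr[lab] = best
        scores.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: scores[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def label_blocks(blocks: Sequence[Tuple[str, int]]) -> List[str]:
    """Label (text, link_count) blocks as content or boilerplate."""
    return viterbi([block_features(text, links) for text, links in blocks])
```

With these toy weights, a short block with no links that sits between two long paragraphs is labelled as content, even though its own features slightly favour boilerplate. This dependence on neighbouring labels is what distinguishes a sequence model from classifying each block in isolation.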

Identifier
PID http://hdl.handle.net/11858/00-097C-0000-0001-48FD-B
Related Identifier http://ufal.mff.cuni.cz/victor/
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11858/00-097C-0000-0001-48FD-B
Provenance
Creator Marek, Michal
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2009
Rights GNU General Public License, version 2; http://www.gnu.org/licenses/gpl-2.0.html; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Resource Type toolService
Format application/x-bzip2; application/octet-stream; downloadable_files_count: 1
Discipline Linguistics