1
0
mirror of https://github.com/moparisthebest/wallabag synced 2024-12-26 01:09:19 -05:00
wallabag/inc/3rdparty/site_config/standard/.about.com.txt
Siôn Le Roux d59536deea Add support for *.about.com
Includes next_page_link for multi-page articles and strips pesky in-line
'next' links from the article body. Also includes an Xpath for author
but I can't see where this is used in the wallabag UI.

The 'tidy' option is turned off because it messed up bulleted lists.

Tested with psychology.about.com and food.about.com.
2014-07-11 00:04:24 +02:00

15 lines
336 B
Plaintext

body: //div[@id='articlebody']
title: //h1
author: //p[@id='by']//a
next_page_link: //span[@class='next']/a
# Not the same as below!
prune: yes
tidy: no
# Annoying 'next' links plainly inside the article body
strip: //*[text()[contains(.,'Next: ')]]
test_url: http://psychology.about.com/od/theoriesofpersonality/ss/defensemech.htm