Package sunlabs.brazil.template
Class ContentTemplate
java.lang.Object
sunlabs.brazil.template.Template
sunlabs.brazil.template.ContentTemplate
- All Implemented Interfaces:
TemplateInterface
- Direct Known Subclasses:
ExpContentTemplate
Template class for extracting content out of remote html pages.
This class is used by the TemplateHandler, for extracting
the "content" out of html documents for later integration with
a look-and-feel template using one or more of: SetTemplate, BSLTemplate, or ReplaceFilter.
The plan is to snag the title and the content, and put them into
request properties. The resultant processed output will be
discarded. The following properties are gathered:
- title
- The document title
- all
- The entire content
- bodyArgs
- The attributes to the body tag, if any
- content
- The body, delimited by <content>...</content>. The text inside multiple <content> ... </content> pairs is concatenated together.
- script
- All "<script>"..."</script>" tags found in the document head
- scriptSrcs
- A white-space delimited list of all "src" attributes found in "script" tags.
- style
- All "<style>"..."</style>" tags found in the document head
- meta-[name]
- Every meta tag "name" and "content"
- link-[rel]
- Every link tag "rel" and "href"
- user-agent
- The origin user agent
- referer
- The user agent referrer (if any)
- last-modified
- The document last modified time (if any) in standard format
- content-length
- The document content length, as fetched from the origin server
- prepend
- Prepend this string to the property names defined above that are populated by this template (defaults to "").
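As a rough illustration of how the gathered properties might be read back, the sketch below looks up names with the "prepend" string applied and splits "scriptSrcs" on white space. The class GatheredProperties is hypothetical and not part of the Brazil API, and java.util.Properties is only assumed here as a stand-in for the request properties.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the Brazil API: reads values that
// ContentTemplate is documented to leave in the request properties.
class GatheredProperties {
    private final Properties props;  // assumed stand-in for the request properties
    private final String prepend;    // value of the "prepend" setting ("" by default)

    GatheredProperties(Properties props, String prepend) {
        this.props = props;
        this.prepend = (prepend == null) ? "" : prepend;
    }

    // Look up a gathered property such as "title" or "meta-description".
    // Returns "" when the property was not set.
    String get(String name) {
        return props.getProperty(prepend + name, "");
    }

    // "scriptSrcs" is documented as a white-space delimited list.
    List<String> scriptSrcs() {
        String raw = get("scriptSrcs").trim();
        if (raw.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(raw.split("\\s+"));
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("page.title", "Example");
        p.setProperty("page.scriptSrcs", "a.js b.js");
        GatheredProperties g = new GatheredProperties(p, "page.");
        System.out.println(g.get("title"));
        System.out.println(g.scriptSrcs());
    }
}
```

Returning an empty string for a missing property is a choice made for this sketch only; it keeps callers from having to null-check optional entries such as "bodyArgs".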
- Version:
- 2.2
- Author:
- Stephen Uhler
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type   Method                    Description
boolean             done(RewriteContext hr)   Extract useful properties out of the http mime headers.
boolean             init(RewriteContext hr)   Called before this template processes any tags.
void                tag_body                  Grab the "body" attributes, and toss all output to this point.
void                tag_content               Toss everything up to and including here, but turn on content accumulation.
void                tag_link                  Extract data out of link tags into the properties.
void                tag_meta                  Extract data out of meta tags into the properties.
void                tag_script                Append all "script" code while in the head section.
void                tag_slash_body            If no content tags are present, use the entire "body" instead.
void                tag_slash_content         Save the content gathered so far, and turn off content accumulation.
void                tag_slash_head            Mark end of head section.
void                tag_slash_title           Gather up the title - no tags allowed between <title> ... </title>.
void                tag_style                 Append all "style" code while in the head section.
void                tag_title                 Toss everything up to and including this entity.
-
Constructor Details
-
ContentTemplate
public ContentTemplate()
-
-
Method Details
-
init
Description copied from class: Template
Called before this template processes any tags.
- Specified by:
init in interface TemplateInterface
- Overrides:
init in class Template
-
tag_title
Toss everything up to and including this entity. -
tag_slash_title
Gather up the title - no tags allowed between <title> ... </title>. -
tag_script
Append all "script" code while in the head section. If the script has a "src" attribute, we'll put the "src" in a variable so the template can deal with it (them?). For now, ignore it. -
tag_style
Append all "style" code while in the head section. -
tag_slash_head
Mark end of head section. All "script" content in the "body" is left alone. -
tag_content
Toss everything up to and including here, but turn on content accumulation. -
tag_body
Grab the "body" attributes, and toss all output to this point. -
tag_slash_content
Save the content gathered so far, and turn off content accumulation. -
tag_slash_body
If no content tags are present, use the entire "body" instead. -
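The content rules described above (concatenate the text of every <content> ... </content> pair, and fall back to the whole body when no such pair exists) can be sketched as follows. This is only an illustration: the real class works through per-tag callbacks rather than regular expressions, and the class name ContentExtractSketch and its extract method are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: mimics the documented outcome of tag_content,
// tag_slash_content and tag_slash_body with regular expressions.
// The real ContentTemplate uses tag callbacks, not this approach.
class ContentExtractSketch {
    private static final Pattern CONTENT = Pattern.compile(
        "<content>(.*?)</content>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern BODY = Pattern.compile(
        "<body[^>]*>(.*?)</body>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the concatenated <content> sections, or the body text
    // when the document has no <content> pairs at all.
    static String extract(String html) {
        StringBuilder gathered = new StringBuilder();
        boolean sawContent = false;
        Matcher m = CONTENT.matcher(html);
        while (m.find()) {
            sawContent = true;
            gathered.append(m.group(1));
        }
        if (sawContent) {
            return gathered.toString();
        }
        Matcher b = BODY.matcher(html);
        return b.find() ? b.group(1) : "";
    }

    public static void main(String[] args) {
        String page = "<html><body>nav<content>A</content>ad<content>B</content></body></html>";
        System.out.println(extract(page));
    }
}
```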
tag_meta
Extract data out of meta tags into the properties. For "http-equiv" tags, set the corresponding HTTP response header. -
tag_link
Extract data out of link tags into the properties. Prefix the "rel" attribute with "link-" to use as the property name. -
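The property naming used for meta and link tags can be sketched like this: a meta tag's "name" becomes "meta-[name]" with its "content" as the value, and a link tag's "rel" becomes "link-[rel]" with its "href" as the value. The class MetaLinkKeys and its methods are hypothetical, java.util.Properties is an assumed stand-in for the request properties, and the "http-equiv" handling is deliberately left out.

```java
import java.util.Properties;

// Hypothetical sketch of the documented key naming; not the Brazil source.
class MetaLinkKeys {
    // <meta name="..." content="..."> becomes prepend + "meta-" + name.
    static void putMeta(Properties props, String prepend, String name, String content) {
        if (name != null && content != null) {
            props.setProperty(prepend + "meta-" + name, content);
        }
    }

    // <link rel="..." href="..."> becomes prepend + "link-" + rel.
    static void putLink(Properties props, String prepend, String rel, String href) {
        if (rel != null && href != null) {
            props.setProperty(prepend + "link-" + rel, href);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        putMeta(props, "", "description", "A sample page");
        putLink(props, "", "stylesheet", "/main.css");
        System.out.println(props.getProperty("meta-description"));
        System.out.println(props.getProperty("link-stylesheet"));
    }
}
```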
done
Extract useful properties out of the http mime headers.
- Specified by:
done in interface TemplateInterface
- Overrides:
done in class Template
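A minimal sketch of the header step, assuming the headers are available as a map keyed by lower-case header name: each of the four documented header properties is copied into the properties under the "prepend" string when present. HeaderPropsSketch and its copy method are hypothetical names, and the map is an assumption made for this example rather than the type Brazil actually passes around.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: copies the documented header-derived values
// into the properties. The header map is an assumed input shape.
class HeaderPropsSketch {
    private static final String[] NAMES = {
        "user-agent", "referer", "last-modified", "content-length"
    };

    static void copy(Map<String, String> headers, Properties props, String prepend) {
        for (String name : NAMES) {
            String value = headers.get(name);
            if (value != null) {  // "referer" and "last-modified" are optional
                props.setProperty(prepend + name, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("user-agent", "ExampleAgent/1.0");
        headers.put("content-length", "1024");
        Properties props = new Properties();
        copy(headers, props, "");
        System.out.println(props);
    }
}
```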
-