Form2WSDL Project Roadmap
Itinerary
Run as a Perl script, output some sample data to test the setup. 11/10/2004
Use LWP::UserAgent to fetch the page whose URL the user entered. Put this HTML into a variable and output it. 27/10/2004
Pipe this HTML through an external program (most likely HTML Tidy) to ensure no validation errors. 4/11/2004
Split the page into discrete components, hopefully through a pre-existing Perl module. Dump everything outside form elements. Maybe store metadata about the fetched page for the report at the end. 15/11/2004
Load the form elements into an array. Take each one and make the necessary arrangements to convert the text and hidden elements into “equivalent” WSDL descriptions. 20/02/2005
Test. Then go through the same process for the more difficult form elements: select and radio. 26/02/2005
Write some test cases and package them with the program.
Develop regex algorithms for guessing at useful parameter types.
Large-scale testing on real-world examples.
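
The type-guessing and conversion steps above could start out something like the sketch below. Everything in it is an assumption made for illustration: the helper names (guess_xsd_type, xml_escape, fields_to_message), the shape of a field (a hash with type, name and value keys), the choice of patterns, and the decision to map each text or hidden element to a part of a WSDL 1.1 message.

```perl
use strict;
use warnings;

# Hypothetical helper: guess an XML Schema type from a field's default value.
# The patterns and their order are illustrative assumptions only.
sub guess_xsd_type {
    my ($value) = @_;
    return 'xsd:string' unless defined $value && length $value;
    return 'xsd:boolean' if $value =~ /\A(?:true|false)\z/i;
    return 'xsd:integer' if $value =~ /\A[+-]?\d+\z/;
    return 'xsd:decimal' if $value =~ /\A[+-]?\d*\.\d+\z/;
    return 'xsd:date'    if $value =~ /\A\d{4}-\d{2}-\d{2}\z/;
    return 'xsd:string';    # fall back to the most permissive type
}

# Escape the characters that are unsafe inside an XML attribute value.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: render text and hidden fields as parts of a message.
# Other field types (submit, select, radio, ...) are skipped in this sketch.
sub fields_to_message {
    my ($message_name, @fields) = @_;
    my @parts;
    for my $field (@fields) {
        next unless lc($field->{type} // '') =~ /\A(?:text|hidden)\z/;
        my $name = xml_escape($field->{name});
        my $type = guess_xsd_type($field->{value});
        push @parts, qq{  <part name="$name" type="$type"/>};
    }
    my $escaped = xml_escape($message_name);
    return join "\n", qq{<message name="$escaped">}, @parts, '</message>';
}

print fields_to_message(
    'SearchRequest',
    { type => 'text',   name => 'q',    value => '' },
    { type => 'hidden', name => 'page', value => '1' },
    { type => 'submit', name => 'go',   value => 'Search' },
), "\n";
```

Guessing from a default value is only a hint: an empty text box tells us nothing, so the sketch falls back to xsd:string whenever no pattern matches.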
Current Outstanding Issues
HTML::Form doesn’t extract the name or id of a form, which we could use in preference to referring to operations as ‘FormNameN’. We’ll have to do some full HTML parsing to get at these.
Operations usually have names that are readable to humans, but HTML forms don’t have any text that would seem to fit.
A form may have two submit buttons, with name attributes to differentiate them. These should be treated as two separate instances of the form.
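
As a rough sketch of how the naming and submit-button issues above might be handled once the form attributes are available: the helper names (operation_name, split_by_submit) and the field shape (a hash with type, name and value keys) are hypothetical, and the attributes are assumed to have already been pulled out by whatever HTML parsing we end up doing.

```perl
use strict;
use warnings;

# Hypothetical helper: choose an operation name for the Nth form on a page.
# Prefers the form's name attribute, then its id, then the 'FormNameN' fallback.
sub operation_name {
    my ($attrs, $index) = @_;
    for my $key (qw(name id)) {
        my $value = $attrs->{$key};
        return $value if defined $value && length $value;
    }
    return "FormName$index";
}

# Hypothetical helper: return one field list per named submit button, so a
# form with two named submit buttons is treated as two instances of the form.
# A form without a named submit button yields a single instance.
sub split_by_submit {
    my (@fields) = @_;
    my @submits = grep {
        ($_->{type} // '') eq 'submit' && defined $_->{name} && length $_->{name}
    } @fields;
    my @others = grep { ($_->{type} // '') ne 'submit' } @fields;
    return [@others] unless @submits;
    return map { [ @others, $_ ] } @submits;
}

my @variants = split_by_submit(
    { type => 'text',   name => 'q',      value => '' },
    { type => 'submit', name => 'search', value => 'Search' },
    { type => 'submit', name => 'lucky',  value => 'Feeling lucky' },
);
printf "%s: %d instance(s)\n", operation_name({ id => 'lookup' }, 1), scalar @variants;
```

Each instance carries exactly one submit button, placed last, so the name=value pair that button would send can become part of that operation's input.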
Icing
Allow referrer passing, like Form2WSDL.pl?url=referrer. Also allow the spelling ‘referer’. 30/11/2004
Respect robots.txt. (More than likely taken care of by LWP::UserAgent, test).
Perform a Content-Type check to stop the converter in its tracks if it receives what looks like, for example, an image or PDF. Give a specific error message. 2/11/2004
- Develop bullet-proof URL validation regular expression. Garnish with client-side JavaScript form validation.
Deal with character sets (hopefully not).
- Allow upload of local file for conversion.
Follow redirects and notify user if URL examined is different from request URI. 22/11/2004
Follow meta refresh redirects. 8/02/2005
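
Two of the items above, the Content-Type check and the meta refresh redirect, come down to small string tests, sketched below. The helper names (is_html_content_type, meta_refresh_target) and the list of accepted media types are assumptions. The regex is a rough approximation that expects http-equiv to appear before content, so a proper HTML parser would be the safer choice in the real converter.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only responses whose media type looks like HTML.
# Parameters such as "; charset=..." after the media type are ignored.
sub is_html_content_type {
    my ($header) = @_;
    return 0 unless defined $header;
    my ($media_type) = $header =~ m{\A\s*([^;\s]+)};
    return 0 unless defined $media_type;
    return lc($media_type) =~ m{\A(?:text/html|application/xhtml\+xml)\z} ? 1 : 0;
}

# Hypothetical helper: extract the target URL from a meta refresh element.
# Returns undef when no such element is found.
sub meta_refresh_target {
    my ($html) = @_;
    return unless defined $html;
    if ($html =~ m{<meta\b[^>]*\bhttp-equiv\s*=\s*["']?refresh["']?[^>]*\bcontent\s*=\s*(["'])\s*\d+\s*;\s*url\s*=\s*(.*?)\s*\1}is) {
        my $url = $2;
        $url =~ s/\A['"]|['"]\z//g;    # some pages quote the URL inside content
        return $url;
    }
    return;
}

for my $type ('text/html; charset=ISO-8859-1', 'application/pdf') {
    printf "%s => %s\n", $type, is_html_content_type($type) ? 'convert' : 'reject';
}
my $target = meta_refresh_target('<meta http-equiv="refresh" content="0; url=http://example.org/new">');
print "refresh target: $target\n" if defined $target;
```

The target returned may be a relative URL, so it would still need resolving against the page's own address before being fetched.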