Monthly Archive for January, 2009

Iterator Pattern Example: Developing a Webpage Scraper

A couple of readers suggested that a full-fledged example might be a good follow-up to my previous post introducing the iterator pattern. This is a good suggestion, as there are few meaningful examples of the iterator pattern that demonstrate its intent and usefulness. This is probably because iterators are a built-in construct in most programming languages. However, built-in iterators are designed to traverse native collections such as Arrays and Dictionaries. To traverse a custom data structure, we need to develop an iterator from the ground up. In this example, we will develop a webpage scraper, like Googlebot, that recursively harvests information from web pages.

Why is the iterator pattern a good candidate for use in developing a webpage scraper? As described in my previous post, the iterator pattern provides a uniform way to traverse and access elements in a collection. A web page is a collection of elements. To harvest the elements (tags), we need to traverse and access the elements in the collection (HTML). The iterator pattern light bulb should go on at this point.

Having a uniform interface to access different elements in the web page is very desirable. Why? Because there are multiple ways to traverse and access different elements. We can develop several concrete iterators to access different tags. In this example, we will develop two concrete iterators: one to access hyperlinks and the other to access images.
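As a rough illustration of the idea, here is a minimal sketch written in TypeScript rather than ActionScript 3.0. The names `ElementIterator`, `PatternIterator`, `HyperlinkIterator`, `ImageIterator`, and `collect` are assumptions made for this sketch and are not taken from the post's code. The regular expressions are deliberately simplistic (they only handle double-quoted attributes), so treat this as a sketch of the structure, not a robust scraper.

```typescript
// Uniform traversal interface shared by every concrete iterator.
// (Hypothetical name, chosen for this sketch.)
interface ElementIterator {
  hasNext(): boolean;
  next(): string;
  reset(): void;
}

// Shared traversal logic: stores the first capture group of every match.
class PatternIterator implements ElementIterator {
  private matches: string[] = [];
  private position = 0;

  // The pattern must carry the "g" flag so exec() advances through the text.
  constructor(html: string, pattern: RegExp) {
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(html)) !== null) {
      this.matches.push(m[1]);
    }
  }

  hasNext(): boolean {
    return this.position < this.matches.length;
  }

  next(): string {
    if (!this.hasNext()) {
      throw new Error("No more elements");
    }
    return this.matches[this.position++];
  }

  reset(): void {
    this.position = 0;
  }
}

// Concrete iterator over the href values of anchor tags.
class HyperlinkIterator extends PatternIterator {
  constructor(html: string) {
    super(html, /<a\b[^>]*\bhref\s*=\s*"([^"]*)"/gi);
  }
}

// Concrete iterator over the src values of image tags.
class ImageIterator extends PatternIterator {
  constructor(html: string) {
    super(html, /<img\b[^>]*\bsrc\s*=\s*"([^"]*)"/gi);
  }
}

// Client code depends only on the ElementIterator interface.
function collect(iterator: ElementIterator): string[] {
  const found: string[] = [];
  while (iterator.hasNext()) {
    found.push(iterator.next());
  }
  return found;
}
```

Because `collect` only knows about the interface, adding an iterator for another tag would not require touching it.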

The example will be developed in two parts. My initial novice attempt will be described first (I’ll call this version 1). I initially treated a web page as an XML document, so that E4X could be used to traverse and identify elements. However, this introduced a major limitation in that only well-formed web pages could be scraped. This didn’t mean that the web pages had to be declared as XHTML Strict per se, but each page had to be structured according to the rules defined in Section 2.1 of the XML 1.0 Recommendation. So, any malformed web pages with missing closing tags or funky characters would fail the test.

In my second attempt (version 2), I treated web pages as text documents and used regular expressions to identify elements. This introduced another, more serious limitation in that my knowledge of regular expression pattern matching was minimal. So, version 2 was more of an adventure in slaying the regular expression dragon than anything else. However, the utility of the iterator was amply demonstrated, as I could extend the scraper app to meet the new requirement without changing any existing code, which is the ultimate test of reusability. Here is the initial class diagram.

Webpage scraper – version 1

Class diagram of version 1

Where’s the Real World?: Design Pattern Examples in ActionScript 3.0

Gentle Readers: This short post is a request for feedback. The whole issue of appropriate level examples both in our books and this blog is an important one because it speaks to the utility of the writings and posts. So, your thoughts are not only welcomed; they’re essential.

I had a meeting with a computer scientist who was teaching a game class. In our short chat, he must have used the term real world a dozen times. Well, I’m all for the real world (in contrast to the unreal world of unicorns, fairy dust, and honest politicians). However, the real world for one is not the same as it is for another. Recently, I got a comment about a blog entry thanking us for a solution to a practical problem that one of our readers encountered in programming. The same post was criticized by another reader as not being real world. Therein lies the dilemma.

Abstract vs. Concrete

Chandima and I use a range of examples in our book. We start with an abstract minimalist example so that the reader can see the participants in a design pattern and then move on to something more concrete to illustrate a practical application. On this blog, most of the examples start with the more abstract elements and move into a fairly general (somewhat abstract) example that is more practical. The more abstract an example, the more general its applicability—not unlike an abstract class. However, the more concrete an example, the better the reader can use it to model a like problem in an eminently practical way. Each has its benefits. The abstract examples have generalizability and the concrete examples have needed detail.

Were I to do all of my examples using real world examples that I deal with, most would involve streaming video and Flash Media Server. My customers usually approach me for just that kind of problem. Obviously, using streaming video and FMS is real world, but it’s not very generalizable. Likewise, some readers complain that the abstract examples don’t help because they’re not practical.

We’d like your thoughts on this issue. Obviously, the most useful examples would be those that you deal with directly in your work, but like my practical work, those are pretty narrow. Keeping these concerns in mind, tell us what’s most helpful to you.

ActionScript 3.0 Abstract Factory Design Pattern: Multiple Products and Factories

This is one of the few design patterns that I worked up directly from the class diagram and from concepts in GoF. Normally, I like to look at some examples, done in Java or C#, but not this time. As you will see in Figure 1, the pattern appears to be fairly daunting, but I found it to be eminently practical, and it seemed to be a direct response to questions that I had about the Factory Method design pattern (See Chapter 2 for an in-depth explanation of the Factory Method.) You can download the entire example here before continuing if you wish.

Let me start with the gist of the example from GoF and provide something more concrete that’s likely to be a typical kind of issue Flash and Flex developers deal with. Imagine a project where your designers have created general templates for a business site and another for a game site. Their templates include a SWF background and a set of buttons for a UI. The buttons are wholly programmed and require nothing in the Library, and so using them for either Flash or Flex is fairly simple.

You want to keep your design loose, and so you decide that a factory will be helpful. However, clearly you will need a factory to create instances of both buttons and the background template. Further, you want your products to derive from an abstract class to give you as much flexibility as possible. In the example here, you will need an abstract product for buttons and another for backgrounds. You also want your factory abstract enough to make requests for sets of objects from the different products. For example, you want your factory to deliver both a set of buttons and a background that are matching pairs. You don’t want a set of buttons for a game site with a background for a business site; you want the buttons to match your background: business buttons with a business background and game buttons with a game background. This is a job for the Abstract Factory.

Figure 1 shows the class diagram. In looking at the “create” lines (dashed lines), think of them as working with matched sets. The Client requests a business set, and it gets both a business product for buttons and another product for background. So while the diagram may look busy, it really is doing something that makes sense on a basic level. That is, the design is geared to sets of products, with factories that create the requested sets rather than individual objects.

Figure 1: Abstract Factory Class Diagram

Note that Figure 1 shows that both concrete factories create instances from each of the child classes of the two abstract product classes. You can very quickly see the practicality of this when you substitute some concrete elements for the more general conceptual names.
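To make the matched-set idea concrete, here is a minimal sketch in TypeScript rather than ActionScript 3.0. It is not the downloadable example from this post. All names (`SiteButton`, `SiteBackground`, `SiteFactory`, `BusinessFactory`, `GameFactory`, `buildSite`) are hypothetical, and the abstract products are modeled as TypeScript interfaces rather than abstract classes.

```typescript
// Abstract products (hypothetical names for this sketch).
interface SiteButton {
  readonly theme: string;
  readonly label: string;
}

interface SiteBackground {
  readonly theme: string;
}

// Concrete products for the business family.
class BusinessButton implements SiteButton {
  readonly theme = "business";
  readonly label: string;
  constructor(label: string) {
    this.label = label;
  }
}

class BusinessBackground implements SiteBackground {
  readonly theme = "business";
}

// Concrete products for the game family.
class GameButton implements SiteButton {
  readonly theme = "game";
  readonly label: string;
  constructor(label: string) {
    this.label = label;
  }
}

class GameBackground implements SiteBackground {
  readonly theme = "game";
}

// Abstract factory: one creation method per abstract product.
interface SiteFactory {
  createButtons(labels: string[]): SiteButton[];
  createBackground(): SiteBackground;
}

// Each concrete factory only ever produces products from its own family,
// so buttons and background always match.
class BusinessFactory implements SiteFactory {
  createButtons(labels: string[]): SiteButton[] {
    return labels.map((label) => new BusinessButton(label));
  }
  createBackground(): SiteBackground {
    return new BusinessBackground();
  }
}

class GameFactory implements SiteFactory {
  createButtons(labels: string[]): SiteButton[] {
    return labels.map((label) => new GameButton(label));
  }
  createBackground(): SiteBackground {
    return new GameBackground();
  }
}

// The client asks a factory for a whole set and never names a concrete product.
function buildSite(factory: SiteFactory, labels: string[]) {
  return {
    buttons: factory.createButtons(labels),
    background: factory.createBackground(),
  };
}
```

Calling `buildSite(new GameFactory(), ["Play", "Quit"])` yields game buttons with a game background; swapping in `new BusinessFactory()` changes the whole set at once.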

Take a Design Pattern to Work Part IV: Establishing a Design Pattern Foundation

Gentle Reader: This is Part 4 of a four-part series of posts on introducing design patterns and OOP into the workplace. Parts 1 through 3 will provide the context for this part. Also, taking a look at No Time for OOP and Design Patterns will give you the background on this series. As always, we invite your comments.

Note: Chandima wrote the chapter in our book on the Factory Method, and he gave me invaluable help on the main program in this post as well.

Recap

Up to this point we’ve examined a simple program that loads external text and graphics, a common ActionScript chore. In the most general terms, this is where we’ve been:

  • Part I: Identifying the problem in a current solution. Why ActionScript on the Timeline can cause problems.
  • Part II: Providing a simple OOP solution: Use of Inheritance
  • Part III: Loosening Up a program’s structure: Adding a design pattern element, a simple factory

To conclude the process, we now come to the last part: introducing an actual design pattern to the workplace.

  • Part IV: Establishing a Design Pattern Foundation.

Given the preceding steps, the context is now in place to add a full design pattern.

From Part to Whole

Part III introduced the Simple Factory method inserted into an existing OOP program. Now it’s time to step back and look at a design pattern in toto. Instead of incrementally adding to the existing program, we will refactor the whole kit and caboodle from the perspective of a design pattern.

To get started, if you’re not familiar with the Factory Method pattern, take a look at Chapter 2. In fact, Chandima’s Sprite Factory example beginning on page 84 is one of the clearest and most appropriate examples that you can find of the Factory Method pattern in ActionScript 3.0. So before continuing, you might want to do a quick review of the Factory Method and take a look at Figure 1, the class diagram for the pattern. (We’ll wait for you…).

Figure 1: Factory Method Design Pattern

As you can see, the Factory Method (simple factory) is part of the Creator interface and the ConcreteCreator. The interface is an abstract class, so at least one of its methods needs to be abstract. The abstract class itself cannot be instantiated directly, but its abstract method is easily overridden in a child class.
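The relationship between the Creator and the ConcreteCreator can be sketched as follows. This is TypeScript, not ActionScript 3.0, and it is not the program from this series: TypeScript has an `abstract` keyword, which ActionScript 3.0 lacks, and the product and creator names (`Product`, `TextProduct`, `GraphicProduct`, `TextCreator`, `GraphicCreator`) are assumptions loosely based on the text-and-graphics loading task described in the recap.

```typescript
// Product interface and two concrete products (hypothetical names).
interface Product {
  render(): string;
}

class TextProduct implements Product {
  render(): string {
    return "text content";
  }
}

class GraphicProduct implements Product {
  render(): string {
    return "graphic content";
  }
}

// Creator: declares the factory method but leaves the choice of
// concrete Product to its subclasses.
abstract class Creator {
  protected abstract factoryMethod(): Product;

  // This operation is written only against the Product interface.
  display(): string {
    return "Displaying " + this.factoryMethod().render();
  }
}

// ConcreteCreators override the factory method to pick a product.
class TextCreator extends Creator {
  protected factoryMethod(): Product {
    return new TextProduct();
  }
}

class GraphicCreator extends Creator {
  protected factoryMethod(): Product {
    return new GraphicProduct();
  }
}
```

A client holding a `Creator` reference calls `display()` without knowing which product class is instantiated behind it.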

Abstract is as Abstract Does: A Forrest Gump Approach to Abstract Classes in ActionScript 3.0

Here it is 2009, and the discussions about the lack of abstract classes in ActionScript 3.0 go back at least to 2006. Chandima’s 2007 post nicely deals with a number of issues concerning abstract classes, and I’m not going to try to improve on that. Instead, in this short post, I want to bring up two simple issues: acting abstractly, and what use there is for the override statement if not for abstract classes.

Abstract is as Abstract Does

In barreling along in my Take a Design Pattern to Work [Part III] series, one of the classes, Staff, works as a nice solution for creating lots of child classes with just a couple of lines of code (the child has just a few lines, not the parent). The Staff class is extended but not instantiated, just like an abstract class. Since we have no abstract modifier in ActionScript 3.0 anyway, what is the difference between a real abstract class and a class that we treat in exactly the same way, never instantiating it but only extending it?

Override for What?

Maybe it’s me, but the only time I use the override statement is when I’m creating a function derived from an ersatz ActionScript 3.0 abstract class. So my question is, why give us an override statement but not give us an abstract class modifier? Does anyone use override for anything other than changing a function derived from a parent class so as to make the parent act like an abstract class?
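For readers who want to see the “extended but not instantiated” idea in code, here is a small sketch. It is written in TypeScript rather than ActionScript 3.0 and deliberately avoids TypeScript’s `abstract` keyword to mirror the situation described here. The `Staff` name comes from the series, but its members (`name`, `title`, `describe`) and the `Manager` child class are assumptions made for this sketch, not the original code.

```typescript
// Parent class that is abstract only by convention: nothing in the
// language stops anyone from instantiating it.
class Staff {
  protected name: string;

  constructor(name: string) {
    this.name = name;
  }

  // Placeholder meant to be overridden; it fails loudly if a child forgets.
  title(): string {
    throw new Error("title() must be overridden in a child class");
  }

  // Shared behavior that every child inherits unchanged.
  describe(): string {
    return this.name + ", " + this.title();
  }
}

// The child needs only a few lines: it overrides the placeholder.
class Manager extends Staff {
  override title(): string {
    return "Manager";
  }
}
```

Here `new Manager("Ann").describe()` works, while calling `describe()` on a bare `Staff` instance throws at run time, which is the only enforcement this approach offers.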

Interfaces: Interface and Abstract Classes

When you look at the design patterns in GoF and elsewhere, the term interface can refer to either an abstract class or an interface (the interface statement in ActionScript 3.0). In design patterns, interfaces are important because they aid in keeping participants loose but connected. Given the goal of looseness, does a Forrest Gump-defined abstract class (extended but not instantiated) serve that goal?

Comments invited.
