Wednesday, July 22, 2009

3-Tier Architecture

Since 3-Tier Architecture is one of the most commonly used architectures in the world, it seems a fitting topic to start my blogging with.

Over the past six years of writing software, I have noticed that many developers ignore this software engineering paradigm for many reasons. Some include ...

Return on Investment
Software Lifecycle Turnaround Time
Knowledgeable Resources
and the list goes on
I will discuss these reasons in more detail in another blog entry, so back to the topic of 3-Tier architecture.

A 3-Tier architecture uses the Divide and Conquer strategy and is broken down into three logical layers:

Presentation Layer (PL)
Business Logic Layer (BLL)
Data Access Layer (DAL)
Ideally, each layer specializes in one or a handful of functionalities that service the layer above it. Each of the three layers should be designed so that the layer above it does not need to understand or know the implementation details of any of the layers below it. This is accomplished by providing well-defined interfaces that the upper layers use. The advantage of "Programming to the Interface" is that you can change the implementation details and still have the application work as specified. One caveat: if the interfaces themselves change, it will take more effort and time to update the layers above them. Therefore, when designing an application, it's important to define the interfaces properly.

Here is an example:

public interface IDataSource
{
    Customer GetCustomer(int id);
}

public class TextDataSource : IDataSource
{
    public Customer GetCustomer(int id)
    {
        // reads data from a text file
    }
}

public class SQLDataSource : IDataSource
{
    public Customer GetCustomer(int id)
    {
        // uses ADO.NET to read data from a SQL Server database
    }
}

public class MyProgram
{
    public static void Main(string[] args)
    {
        // The DataSourceFactory will create a data source depending on some settings
        // and return the appropriate implementation of the data source.
        IDataSource ds = DataSourceFactory.GetInstance().GetDataSource();
        Console.WriteLine(ds.GetCustomer(15).FirstName);
    }
}

As you can see from the example, MyProgram does not need to know which data source it is querying; all it needs to know is the interface it must use to retrieve a customer record. If we have numerous implementations, then by changing only the configuration (which is declarative and lives outside of the compiled code) we can change the data source the application uses to retrieve data without touching the application logic itself.
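To make the factory part concrete, here is a minimal sketch of what such a DataSourceFactory might look like. The "DataSourceType" setting name and the use of ConfigurationManager are my own assumptions for illustration, not the only way to do it:

using System.Configuration;

public class DataSourceFactory
{
    private static readonly DataSourceFactory instance = new DataSourceFactory();

    private DataSourceFactory() { }

    public static DataSourceFactory GetInstance()
    {
        return instance;
    }

    public IDataSource GetDataSource()
    {
        // "DataSourceType" is a hypothetical app.config setting, e.g.
        // <add key="DataSourceType" value="Text" />
        string type = ConfigurationManager.AppSettings["DataSourceType"];

        if (type == "Text")
            return new TextDataSource();

        // Default to the SQL Server implementation.
        return new SQLDataSource();
    }
}

Because the choice is driven by configuration, swapping data sources becomes a config edit rather than a code change.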

Now let's see how we can use the "Programming to the Interface" paradigm to create a 3-Tier architecture.

The Presentation Layer is responsible for rendering the data retrieved by the BLL (with the help of the DAL). The only logic necessary in this layer is how to manipulate the data and display it to the user in an easy-to-consume manner. Along with rendering the content, it should be responsible for rudimentary data validation such as missing fields, regular expression matching for emails and other content, numeric validation, range validation, etc.
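As a rough illustration, this kind of rudimentary validation might look something like the following (the method names, the e-mail pattern and the age range are all hypothetical):

using System.Text.RegularExpressions;

public static class InputValidator
{
    // A simplistic e-mail pattern for illustration only; real-world
    // e-mail validation is considerably more nuanced.
    private static readonly Regex EmailPattern =
        new Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$");

    public static bool IsValidEmail(string email)
    {
        // Missing-field check plus regular expression matching.
        return !string.IsNullOrEmpty(email) && EmailPattern.IsMatch(email);
    }

    public static bool IsValidAge(string ageText)
    {
        // Numeric validation plus a simple range validation.
        int age;
        return int.TryParse(ageText, out age) && age >= 0 && age <= 120;
    }
}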

In .NET, there is a slew of UI-specific controls that one may use to render the data, including the DataList, DataGrid, Label and TextBox, and of course custom controls for advanced developers. There are also pre-built validation controls bundled with the .NET Framework. I'll post some links with examples of how to use these controls soon.

Depending on the application you are building, the presentation layer may be one or more of the following types of applications: web, Windows, Windows service, smart client or console. By properly defining the responsibilities of each layer, the only logic developers need to write per application type is the presentation layer. Since retrieving a customer record is the same throughout the application (it goes through the BLL), you can abstract away all the details, hence simplifying your software.

On the same note, your application may also expose Web services and Remoting services, so it is essential to centralize the code. Otherwise the logic for retrieving a customer (which may include security authorization and authentication, data validation, pre-processing and post-processing) will need to be duplicated in many places. Code duplication may seem viable in the early stages of the software lifecycle, but software with duplicated logic is extremely hard to maintain.

The Business Logic Layer is like the kernel of your application. It should be the component that performs all the business logic for your application: validating data, running business rules, executing application processes such as sending emails, and retrieving and persisting data through the Data Access Layer.

Although validations were performed in the presentation layer, it is imperative that you revalidate the data here, because requests could have been spoofed, older browsers might have ignored some of the client-side validations, or the developers working on the presentation layer might not have validated the data properly.
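Here is a minimal sketch of what that revalidation might look like in a business object; the CustomerService name and the rule itself are mine, for illustration only:

using System;

public class CustomerService
{
    private readonly IDataSource dataSource;

    public CustomerService(IDataSource dataSource)
    {
        this.dataSource = dataSource;
    }

    public Customer GetCustomer(int id)
    {
        // Never trust the presentation layer: revalidate the input here,
        // even though the UI should have validated it already.
        if (id <= 0)
            throw new ArgumentOutOfRangeException("id", "Customer id must be positive.");

        // Delegate the actual data access to the layer below.
        return dataSource.GetCustomer(id);
    }
}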

Depending on the complexity of your application, business logic code may not reside on the same server or in a centralized location, so there are advanced means of executing such logic remotely. With .NET, this process has been extremely simplified and is available to you within a few clicks of your mouse. One option is .NET Remoting, an advanced topic that I'll leave out for now. The other is a buzzword that many have heard: Web Services.

Both writing and consuming a Web Service are once again simplified by Microsoft. Visual Studio 2003 and 2005 can download the WSDL and generate the proxies for you, so you can invoke the remote functions as you would methods on a business object in your application. If you don't have Visual Studio, you may use the "wsdl.exe" utility that is bundled with the .NET Framework from the command prompt.
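For example, generating a proxy from the command prompt might look something like this (the URL, namespace and output file are placeholders):

wsdl.exe /language:CS /namespace:MyApp.Proxies /out:CustomerServiceProxy.cs http://localhost/CustomerService.asmx?WSDL

The generated CustomerServiceProxy.cs can then be compiled into your project and invoked like any local class.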

If you have business logic on legacy systems built with Microsoft technologies such as COM+ that cannot be rewritten for whatever reason, not to worry: you can use the COM interop wrappers provided by the .NET Framework to communicate with those legacy systems.
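As a rough sketch, late-bound COM interop from C# looks like this; the ProgID below is made up for illustration:

using System;

public class LegacyBridge
{
    public static object CreateLegacyComponent()
    {
        // "LegacyApp.CustomerComponent" is a hypothetical ProgID.
        Type comType = Type.GetTypeFromProgID("LegacyApp.CustomerComponent");
        if (comType == null)
            throw new InvalidOperationException("Legacy component is not registered.");

        // The runtime wraps the COM object in a Runtime Callable Wrapper (RCW).
        return Activator.CreateInstance(comType);
    }
}

In practice, you would more often generate a typed interop assembly with tlbimp.exe to get early-bound, strongly typed access.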

The Data Access Layer is responsible for accessing and manipulating data from data sources such as SQL Server, Microsoft Access, Oracle, MySQL, etc. Many applications on the Internet today rely heavily on data spread across many databases, and it is important to centralize access to this data. Some reasons are ...

Security
Code Maintenance and Reuse
Scalability
Databases contain confidential information about people, and it is not necessary for everyone in your organizational hierarchy to have access to such data. For example, credit card information stored on Amazon.com shouldn't be available to an entry-level employee working in a warehouse. By centralizing access to the database, we are able to authenticate and authorize the users requesting and manipulating the data.

Since our economy is constantly in flux, it is never safe to assume that once you have created your data model, it will not change for a decade. In fact, the data model may change tomorrow, a week from now or in a year; it will change, and as software architects it is our responsibility to foresee such events and design systems that can change with time. If the code isn't centralized, a database change as simple as adding a new column may result in days of changes to many systems, regression testing and redeployment of many applications. Is this really necessary?

By creating simple reusable components, developers are able to abstract away all of the details of creating connections, handling errors, invoking the appropriate stored procedures or executing Transact-SQL or PL/SQL code, retrieving the data, and closing the connection, keeping them out of the Business Logic Layer.

Typical code (using ADO.NET) to retrieve a customer record may look like the following:

SqlConnection connection = null;

try
{
    connection = new SqlConnection(mySqlConnection);
    connection.Open();
    SqlCommand cmd = connection.CreateCommand();
    // Use a parameter rather than string concatenation to avoid SQL injection.
    cmd.CommandText = "SELECT * FROM Customers WHERE CustomerID = @cid";
    cmd.Parameters.AddWithValue("@cid", cid);
    return CustomerFactory.GetInstance().GetCustomer(cmd.ExecuteReader());
}
catch (Exception e)
{
    Console.WriteLine("Exception :: " + e.Message);
}
finally
{
    // Make sure the connection is released even if an exception is thrown.
    if (connection != null && connection.State != ConnectionState.Closed)
        connection.Close();
}

So what's the big deal about writing a few lines of code? Well, imagine repeating these lines in 5 different places and, 2 weeks later, making a change and having to remember all the pieces of code to update. Instead of that approach, I suggest the following:

Customer customer = CustomerDAO.GetInstance().GetCustomer(customerID);

Here the ADO.NET code is abstracted away by the GetCustomer(int) function, so making a change to the function takes minutes, and the change affects every piece of code that depends on retrieving customer records. The above examples use Design Patterns, more specifically the Factory Pattern and the Singleton Pattern, which you may read about further at your leisure.
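For completeness, here is a minimal sketch of such a CustomerDAO, combining the Singleton Pattern with the ADO.NET code shown earlier; the "CustomerDb" connection-string name is an assumption for illustration:

using System.Configuration;
using System.Data.SqlClient;

public class CustomerDAO
{
    private static readonly CustomerDAO instance = new CustomerDAO();
    private readonly string connectionString;

    private CustomerDAO()
    {
        // "CustomerDb" is a hypothetical connection-string entry in app.config.
        connectionString =
            ConfigurationManager.ConnectionStrings["CustomerDb"].ConnectionString;
    }

    public static CustomerDAO GetInstance()
    {
        return instance;
    }

    public Customer GetCustomer(int customerID)
    {
        // All connection handling lives here, in one place.
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            SqlCommand cmd = connection.CreateCommand();
            cmd.CommandText = "SELECT * FROM Customers WHERE CustomerID = @cid";
            cmd.Parameters.AddWithValue("@cid", customerID);
            connection.Open();
            return CustomerFactory.GetInstance().GetCustomer(cmd.ExecuteReader());
        }
    }
}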

Scalability is huge for enterprise-level applications that require time-critical data. It's beyond the scope of this article, so I will leave it out for the time being.

So there you have it.
