
Agile Methodology vs classical development

Please do realize that for the sake of brevity, I am oversimplifying at least half a dozen techniques (such as Extreme Programming, Scrum or Feature Driven Development), about which entire books have been written, into a bullet point list... This article is meant more as a way to spark interest. You can read the manifesto in full at http://agilemanifesto.org/ .

Traditionally, software was written in one big cycle that started with planning, continued through implementation and testing, and ended with the deployment of all features at once, at which point development was considered done. The feature set was decided in advance and, in that context, moving even one tiny bit away from that feature set was frowned upon. There was no requirement for intermediate releases, and feedback came only after the application was finished.

Agile methodology is en vogue and it really does offer interesting concepts to programming, namely:
  • continuous testing of the application/ automated regression testing
  • only implement the features that the client needs
  • work in short monthly cycles
  • actual features/working code over documentation
  • always have a shippable version (especially at the end of each cycle)
  • always keep a prioritized list of features and implement the first feature on the list
  • develop a reliable method of estimating (like planning games)
  • no plan is set in stone, you adapt to the client as you go along.
The idea stems from the fact that often, the developer only really knows how to implement the product after having gone through the whole process once; by using this method, you get to optimize what works as you go along. The client also becomes productive sooner, because you roll out updates as you go. Further, you can rectify wrong turns along the way because you get constant feedback from the customer... indeed, this is a particularly seductive method. Ideally, all is well in wonderland.

However, keep in mind that with this method it is very easy to stray from the original development goals... which is fine as long as the client goes along with the direction the project is taking. It is ideal for in-house development. Developers must be careful when using Agile methods on a contract, because the client can always say, "this is not what we originally agreed on". For these cases, it is good to put processes in place to track the changes and get client validation. In my humble opinion, it is also good to keep a set of general guidelines, a direction that does stay constant, or you may end up with tons of nice features that are fully functional (thanks to the nature of the Agile method) but not necessarily useful.

Further, Agile is criticized by some as a way to oversimplify programming... it is not, if done right. Indeed, with constant testing, constantly shippable versions and progressive deployment, I would argue that more complexity is introduced, and in the end it brings about a positive change: better code and the avoidance of costly mistakes.

One thing is certain: spend time picking a methodology to formally structure your development, especially in group development. Also, one method may work best for one project and slow down the development of another... be agile, even about your development method. Happy programming!

General Tips

This section contains tips that apply to more than one platform, along with related aspects of web development such as good practices.

AMI/EC2 quick LAMP install

This is what I run when I want to quickly create an AWS instance running Apache, MySQL and PHP:
yum install -y php-mysql php-xml php-mcrypt php-mbstring php-gd
yum install -y php-pear
yum install -y mysql-server
yum install -y php-devel
yum install php-pecl-apc
yum install -y php-pear php-devel httpd-devel
chkconfig --levels 235 httpd on
chkconfig --levels 235 mysqld on
service mysqld start
service httpd start
echo max_allowed_packet=16M>>/etc/my.cnf
echo delayed_insert_timeout=10000>>/etc/my.cnf
echo connect_timeout=100000>>/etc/my.cnf
Note: if you need pretty URLs, make sure that mod_rewrite is enabled (it is enabled by default on CentOS; on Ubuntu run a2enmod rewrite). Then make sure the httpd config /etc/httpd/conf/httpd.conf allows overrides (AllowOverride All). Then, if I need drush, I install it with the following two lines:

pear channel-discover pear.drush.org
pear install drush/drush

Installing an SSL host

In /etc/httpd/conf/httpd.conf (or a config file included from it):
<VirtualHost *:443>
  ServerName secure.webdevpower.com
  ServerAdmin webmaster@webdevpower.com
  SSLEngine On
  SSLCertificateFile /etc/certs/webdevpower.com.crt
  SSLCertificateKeyFile /etc/certs/webdevpower.key
  SSLCertificateChainFile /etc/certs/gd_bundle.crt
  Options FollowSymLinks
  DirectoryIndex index.html index.php
  DocumentRoot "/var/www/html/secure"
</VirtualHost>
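For this to work, mod_ssl needs to be installed and the configuration tested and reloaded; on a stock CentOS httpd install, something along these lines should do it:
yum install -y mod_ssl
apachectl configtest
service httpd restart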

Create a Linux swap file

Adding a swap file helps a machine with little memory cope when RAM runs short. Here is my way to create one:
dd if=/dev/zero of=/swapfile1 bs=1024 count=4524288
# df -k to check available disk space first
chown root:root /swapfile1
chmod 600 /swapfile1
mkswap /swapfile1
swapon /swapfile1
echo '/swapfile1 swap swap defaults 0 0' >> /etc/fstab
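To confirm the swap space is active afterwards, you can check with:
swapon -s
free -m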

Useful shell one-liners

The following are a few one liner shell commands I occasionally use:
  • Get the current external ip address: GET checkip.dyndns.org | tr -d ' /:-z'
  • Get an estimation of postfix mail queue: mailq|wc -l
  • Find out if a process is running by name: ps uxa|grep -i processname
  • Get a list of directories sorted by size: du -s -h *|sort -h
  • ping a list of ips: cat file-of-ips | xargs -n 1 -I ^ -P 50 ping ^
  • ping a list of hosts in a file: cat hosts|xargs -n1 ping -c 1

Node.js + MongoDB on CentOS

One selling point of Node.js is that, paired with a database server like MongoDB, it lets you write the whole web application in JavaScript. Though most developers have a love/hate relationship with JavaScript that varies depending on the day, using it all the way from the server side to the client side does simplify code writing. Because Node.js is relatively recent, it does not run out of the box on CentOS installs. Here are the steps I used:
# Install the EPEL/IUS repositories from http://iuscommunity.org/
wget http://dl.iuscommunity.org/pub/ius/stable/CentOS/6/x86_64/epel-release-6-5.noarch.rpm
wget http://dl.iuscommunity.org/pub/ius/stable/CentOS/6/x86_64/ius-release-1.0-11.ius.centos6.noarch.rpm
yum localinstall epel-release-6-5.noarch.rpm
yum localinstall ius-release-1.0-11.ius.centos6.noarch.rpm
# Install MongoDB
yum install mongodb-server
# Install build tools
yum install openssl-devel gcc-c++ gcc
# Download node.js
wget http://nodejs.org/dist/node-latest.tar.gz
tar xzvf node-latest.tar.gz
# Change to the node directory - version 0.10.5 as of this writing;
# your actual directory name may be slightly different
cd node-v0.10.5
# Build node
./configure
make
make install
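Once that finishes, a quick sanity check confirms both pieces are in place (the mongod service name comes from the EPEL mongodb-server package; adjust it if yours differs):
service mongod start
chkconfig mongod on
node -e "console.log('node ' + process.version + ' is working')"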

Use version control

Source control systems like Git are aimed at helping developer teams work together and indeed, they allow massive productivity gains in that scenario. However, it is important to note that even when working solo, a source control system like Git can be extremely useful. Here are some of the key benefits:
  • Source code history - no matter how good a developer you are, there are times when you need to go back to a previous state. By committing incrementally and committing often, you have multiple stages to which you can revert, and this can be done fast
  • Multiple branches for multiple features - you can work on different features in different branches. Experiment all you want and easily compare different versions of the same code by alternating between branches
  • Auto push to the production server - you can push code to the production server easily whenever you feel there is a positive change (see the sketch after this list). Not only can the process be made easier, but because it is less of a hassle, you will be encouraged to commit more often. Further, when changes are made on the server, it is easier to incorporate them back transparently. It is also less risky because you can always revert
  • It helps with documentation - if you add a meaningful message with every commit, you can track what you have done more easily and remember it later
  • You can get a feel for progress - not only do you get a list of commits, but many benchmarks (such as the number of commits) can be used to gauge how you are doing
  • Custom code - many times code needs to be developed for a specific client. Other times, you are extending an existing open source project... the custom code can be merged easily
  • Better disaster recovery - because it is so easy to commit to different servers, there is no single central copy to lose... every push to a remote repository is effectively a backup. Even if a hard drive dies, on the server or on the development machine, recovery is almost instant
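As a rough sketch of that solo workflow (the remote name prod, the repository path and the branch names are only examples):
git init
git add . && git commit -m "initial import"
git checkout -b photo-gallery          # experiment on a feature branch
# ...edit, test...
git commit -am "gallery: first working version"
git checkout master && git merge photo-gallery
git remote add prod ssh://user@server/var/repos/site.git
git push prod master                   # a post-receive hook on the server can then deploy the code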
Of course, when you are in a team, it helps even more:
  • Code sharing - everyone can get the latest code. They can also check out each other's branches and help each other without necessarily messing up their own code or having to be on your server
  • Conflict resolution - sometimes two people will change the same code and a potential conflict will be introduced... the software can flag these changes and prompt you to merge everything or only part of it
  • Team awareness - if a developer is not available, another can always pick up in case of emergency, knowing what was changed and each file's history
  • Robustness - the application ends up more robust because it is integrated more often and conflicts are identified early
  • Increased productivity - when working individually, team members will often develop the same code, either knowingly because they need a working version or unknowingly because they do not know what others have done. Working this way, not only is that avoided, but a fresh pair of eyes on each other's code makes things more robust
  • Flexibility in code sharing - since there can be more than one repository, it is easy to imagine a tree of repositories where code is shared among the team and merges are done only when needed. Sensitive parts can be kept out of the main repository while still allowing collaboration among the people who need them
  • Better problem handling - when working in a team, many problems can arise. Whether it is the rare malicious intent or simply forgetting who did what, it is a lot easier if you can pinpoint the origin of a particular change. This can be used in a positive way too: if someone comes up with a really clever fix, it is easier to know who to praise

Optimize performance

You may think your site is too small to bother with performance optimization... it never is. Given the nature of the web, it is always good to be ready for a flood of visits after you get linked from a popular website. Further, the more responsive your site is, the more people (including you) will enjoy their visit. A sluggish site takes longer to be indexed and is more likely to be forgotten by users, who will be less likely to come back. Google did a study on this: if a web page takes more than 5 seconds to load, about 25% of people will leave... so if your page is taking too long to load, you are already losing visitors.

Optimization tip: Optimize database requests

Relational databases rock when it comes to extracting data, but they are very resource hungry. Here are a few things you can do:
  • Indices - I have seen cases where adding a few indices turned a site that was crawling on its knees into a fast, responsive website
  • cache queries in a computed, ready-to-serve form
  • examine the possibility of using flat files or a non-relational (NoSQL) database... especially if the data is read-only, has a simple format and the query rarely changes
  • consider grouping multiple insert statements into one if your database supports it
  • consider optimizing queries - if, for example, you do not need a join, do not include it
  • consider using transactions when making multiple database requests, so everything is sent and committed at once. I once cut the import time of a huge database by at least an order of magnitude by wrapping the queries in a transaction (see the example after this list)
  • consider database replication - sometimes by having more than one server replicating, you can improve performance. This works especially well in intranets where users are grouped at a limited number of sites
  • consider stored procedures... they will increase load on the database server but decrease the data sent back and forth between the application and the db server. Also, the work happens on the server, directly on the raw data
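To illustrate the index and transaction points above (the database, table and column names are made up):
mysql -u root mydb <<'SQL'
-- add an index on a column that is constantly used in WHERE clauses
ALTER TABLE orders ADD INDEX idx_customer (customer_id);
-- wrap a bulk import in a single transaction instead of one commit per row
START TRANSACTION;
INSERT INTO orders (customer_id, total) VALUES (1, 9.99), (2, 19.99), (3, 5.00);
COMMIT;
SQL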

Optimization tip: Consider a daemon process

Sometimes you need to access data or a resource that has a long initialization process. By offloading the work to a daemon that stays in memory, pays the initialization cost once and answers requests from there, a query that used to take ages can become near instantaneous.
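Here is a minimal sketch of the idea in shell (the file names and the key=value data format are made up; a real daemon would more likely live in your application language):
#!/bin/bash
# pay the expensive initialization cost once: load a big key=value file into memory
declare -A CACHE
while IFS='=' read -r key value; do
  CACHE["$key"]="$value"
done < /var/data/bigtable.txt
# then answer lookup requests arriving on a named pipe
PIPE=/tmp/lookup.pipe
[ -p "$PIPE" ] || mkfifo "$PIPE"
while true; do
  read -r key < "$PIPE"
  echo "${CACHE[$key]:-not found}" >> /tmp/lookup.out
done
A client would then echo a key into /tmp/lookup.pipe and read the answer from /tmp/lookup.out, skipping the long init on every request.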

Optimization tip:Consider a cron job

Some tasks, such as caching or prefetching data, can be offloaded to a cron job that runs at regular intervals, preferably at off-peak times.
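For example, a crontab entry along these lines rebuilds a cached page every night (the script path is hypothetical):
# rebuild the cached front page every night at 3am, when traffic is low
0 3 * * * /usr/bin/php /var/www/html/scripts/rebuild_cache.php > /dev/null 2>&1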

Optimization tip: Consider using specialized servers

Search engines are a good example: some machines crawl, others index the data and others serve it. If you have limited web resources, such as a VPS, consider running some of the processes at home, where a powerful machine costs you nothing extra.

Optimization tip: Optimize code

  • Consider a JIT compiler or a compiler for interpreted languages. Compilers such as HipHop for PHP bring huge performance gains and a smaller memory footprint. Sometimes it may make sense to rewrite one part of the code in a lower level language such as C
  • Delay framework usage for simple stuff... frameworks allow very cool things such as MVC and simplify code reuse, but they have a performance cost
  • See if you can optimize the algorithms to use fewer resources. There are times where an object oriented design is more appropriate and lets you swap new algorithms in and out on the fly
  • Prune out unused code regularly
  • Make sure there are no race conditions and identify places where you can speed up function calls. Once, an application was very slow simply because it queried the same hostnames over and over. By identifying the duplicate calls and reusing the results, the site became much faster (see the sketch after this list)
  • Check whether network factors such as latencies and error rates need improvement (once, a whole site was brought down by a faulty ethernet cable)
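To illustrate the duplicate-call point, here is a tiny shell sketch that resolves each hostname only once and reuses the result (the same idea applies to any expensive repeated call):
declare -A DNS_CACHE
resolve() {
  local host=$1
  if [ -z "${DNS_CACHE[$host]}" ]; then
    DNS_CACHE[$host]=$(getent hosts "$host" | awk '{print $1}')
  fi
  echo "${DNS_CACHE[$host]}"
}
resolve example.com   # real lookup
resolve example.com   # answered from the in-memory cache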

Optimization tip: Load balancing

  • consider load balancing if necessary - when more traffic comes, it sometimes makes sense to spread the work over more than one server. Many setups exist, among others clusters and round-robin solutions. Having more than one server available has the added benefit of allowing failover if one server goes down, and of scaling more easily by adding servers as needed.

Optimization tip: cache content

RSS feeds, databases and fetched data like social network interactions can bring lots of power to your site... but they can also really slow it down. Make sure the cache refreshes often enough, yet as rarely as possible (in some cases data can be cached for days), either in a simple way or through specialized processes like memcached. This particularly holds true for anonymous access to the landing page, which often contains no user-specific content.
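As a simple illustration (the URL, cache path and one-hour lifetime are arbitrary), a fetched feed can be kept on disk and only re-downloaded when it gets stale:
CACHE=/var/cache/mysite/feed.xml
URL=http://example.com/feed.rss
# re-fetch only if the cached copy is missing or older than 60 minutes
if [ ! -f "$CACHE" ] || [ -n "$(find "$CACHE" -mmin +60)" ]; then
  curl -s "$URL" -o "$CACHE"
fi
cat "$CACHE"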

Optimization tip: Alternative servers

Alternative servers like nginx can instantly boost your performance (though, to be fair, Apache has made efforts to catch up). Consider alternatives for all your server processes.
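On CentOS with the EPEL repository enabled, trying nginx in place of Apache is roughly the following (adapt the document root and configuration to your setup before switching for real):
yum install -y nginx
service httpd stop
service nginx start
chkconfig nginx on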

Optimization tip:  consider different/ simpler programs

Some great software is out there that can do tons of stuff... but at the cost of added resources. If all you need is a picture gallery, consider a simple gallery program and not a big, complicated CMS that happens to have a picture feature.

Optimization tip: consider static pages

Many times, dynamic pages are necessary... but sometimes a simple HTML file will do, perhaps offloading the dynamic part to a small script.

Optimization tip: Separate Static content

It is good to host static content on a separate server. Even if your main server is under heavy load, the images and CSS will still load fast.

Optimization tip: Only run the services needed

On development machines, it is tempting to enable tons of services you will not use, just for fun and wow factor... on production machines, run only what is needed. Not only will your server be more responsive, but your installation will also be more secure, with fewer potential holes.
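On CentOS, a quick way to see what starts at boot and to turn off something you do not need (bluetooth here is just an example of a service rarely useful on a web server):
chkconfig --list | grep ':on'
chkconfig bluetooth off
service bluetooth stop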

Optimization tip: optimize images, css, preload

  • Order of HTML matters: make sure you load the important parts of the page first
  • If you have many images, preload them when possible; it will make your site feel more responsive
  • Compress and/or optimize the CSS (and the images) if you can - see the example after this list
  • Consider offloading work to client-side JavaScript for a more responsive UI (things like validation, dependent combos, auto-complete)
  • Consider partially reloading the page with AJAX... more responsive, less traffic
  • With complex JavaScript, make sure your UI remains responsive; only do the needed work upfront or the page may become unresponsive
  • If possible, avoid multiple copies of the same file (image, music, etc.) across pages... it consumes more bandwidth, makes the site less responsive and makes maintenance harder
  • If your page will take a long time to load, make sure at least something displays
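For the image and CSS points, command-line optimizers can shrink assets before upload; for instance (the file names are examples and exact flags may vary by version):
# losslessly recompress a PNG and re-encode a JPEG at a reasonable quality
optipng -o2 logo.png
jpegoptim --max=85 photo.jpg
# minify CSS with the minifier of your choice, e.g.
yui-compressor style.css -o style.min.css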
Good luck making your sites more responsive!

Building your reputation

When you work as a consultant, especially if you are independent, it pays to think about the long run, so a few things can help:

  • Have a methodology, do your homework. If you come in for a meeting, come with questions to ask, maybe a mock screen of what you think it will look like. It will encourage discussion
  • Ask for a reference/testimonial in writing after a job well done... it may be hard to keep in touch with the person years later and it is always handy. If the customer is happy, they will be happy to oblige.
  • Always provide a few extra features and make sure the client knows they were provided as an extra feature (obviously get paid)
  • If you provide a discount, make sure the client knows he is getting something for free
  • Always take notes in a meeting, it will allow you to remember the client's ideas and concerns

In the end, it is about the perceived value of your work and keeping the customer happy.

Kloxo: park many domains at once

With Google retiring parked domains, you may want to host a parking page yourself... which is easy to do, but what if you have hundreds of domains? Assuming Kloxo is installed (which presupposes CentOS 5), the following script adds them all in one go, parked under a parent domain:
#!/bin/bash
if [[ $# -lt 1 ]]; then
  echo "Mass Domain Add via Kloxo"
  echo Usage:
  echo "addparked parentdomain.com <1perlinedomainfile.txt"
  echo addparked parentdomain.com
  exit
fi
while read line
do
   /script/add --parent-class=domain --parent-name=$1 --class=addondomain --name=$line --v-type=parked
done

Document & script as much as you can

When a server is installed, many commands are typed. When an application is installed, many config files are changed, and though you may get a working machine slightly faster by skipping the paperwork, it is good to document the install process as you go along, for several reasons:
  • By documenting the process, you always know what was done months ago. This is especially important in teams, where the person who set up the server may be unavailable during an emergency
  • Because the process is documented, it is a lot easier to do the second time around. Not only do server crashes and upgrades happen, but who knows when you will need an additional server. Ideally, go further than documenting the process and script the installation... that will avoid errors when reinstalling in a rush
  • In case of a problem, it covers your bases (a client or boss may forget they actually asked you to do something in the first place), and your mind is at ease because you can go back and explain why things were done the way they were
  • It lets you know how long things take, so you can more easily estimate future work
  • Writing it down commits the experience to memory better, and you can reflect on what to do better the next time
Going one step further, scripting the install process is good for several reasons:
  • Obviously, the speed is even greater
  • it avoids errors and omissions
  • there will be no hesitation if you need to reinstall the whole thing
  • If you ever get hacked (and I hope not), you're back up with a new machine almost instantaneously
As for backups, automate them and keep more than one copy:
  • If the backup process is in a cron job, you will neither forget nor procrastinate; the computer does it by itself (see the sketch after this list)
  • By having more than one copy, if one goes bad you can rely on another. A standard practice is to keep five copies of the data, so you can revert to any previous day should there be a problem, though in some cases this is overkill because the data rarely changes, or simply not practical
  • With source code, use a revision control system such as Git, CVS or Subversion (it does not matter which one, just use one). If "it worked yesterday", you can always go back to yesterday's version
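A minimal sketch of an automated rotating backup (the paths and schedule are examples; one tarball per day of the week gives a week of history):
# in root's crontab: every night at 2am, overwrite the slot for the current weekday
0 2 * * * tar czf /backup/site-$(date +\%a).tar.gz /var/www/html /etc/httpd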

KISS Principle

A good way to be productive is the KISS principle (Keep It Simple & Sweet, or Keep It Simple, Stupid, depending on who you ask :) )

When you start a project, it may seem tempting to implement all the features, with buzzwords, bells and whistles, making everything perfect. Unfortunately, this may lead to tons of delays and end with a very complex product that is not adapted to the real needs, especially if the product is at the proof-of-concept stage.

The KISS principle says that it is good to have a working but imperfect product deployed, get feedback, and only then bring in the additional features. Structure your product into easy-to-master pieces and implement each one as simply as you can... you can always come back and improve the pieces with missing functionality afterwards.