Alfresco Repository Clustering

October 23, 2013 By Seed IM

Alfresco clustering is necessary for high availability, and it also allows your service architecture to scale out horizontally.

In this post we explore an Alfresco repository clustering configuration that is used in 95% of Alfresco clustered installations. It offers high availability while keeping architectural complexity and duplication of information low.

Simple Alfresco Repository Cluster Architecture

 

As shown above, the Simple Alfresco Repository Cluster consists of the following:

  • A load balancer configured to use sticky sessions to distribute incoming requests between the Alfresco application servers. The load balancer also provides automatic failover: whenever one node fails, requests are sent to the next node in the cluster. In our case Apache is used as the load balancer, but IIS could also be used.

  • Two Alfresco nodes allocated to the content platform tier as application servers. More Alfresco nodes can be added as required, which makes the application architecture very scalable. Alfresco Enterprise 4.1.6 was installed on Ubuntu 12.04 for the application tier.

  • One database shared between the two nodes.

  • One filesystem shared between the two nodes.

  • A local Solr index kept on each node.

Storage Tier

The storage tier comprises a database for metadata storage and a filesystem for content storage. For content storage, setting up a Samba shared drive proved to be an easy way to give our Alfresco nodes an accessible filesystem. For the Alfresco database, MySQL was used.

Samba Shared Drive Set Up

Since we are using Ubuntu 12.04, the following commands are Ubuntu-specific but can easily be adapted for other Linux distributions.

#sudo apt-get install samba  (Use yum install samba for redhat)

#mv /etc/samba/smb.conf /etc/samba/smb.conf.template

#vim /etc/samba/smb.conf

[global]

    ; General server settings

    ; Normally a DNS name should be used, but in our situation an IP address was the way to go.

    netbios name = 10.0.0.19

    server string =

    workgroup = WORKGROUP

    announce version = 5.0

    socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE SO_RCVBUF=8192   SO_SNDBUF=8192

    passdb backend = tdbsam

    security = user

    null passwords = true

    username map = /etc/samba/smbusers

    name resolve order = hosts wins bcast

    wins support = yes

    printing = CUPS

    printcap name = CUPS

    syslog = 1

    syslog only = yes

[alfresco]

   comment = Alfresco files

   read only = no

   guest ok = no

 ; Path can be any location on the server

   path = /opt/alfrescoClusterData  

Adding users who can access the share

#smbpasswd -a username

  Enter the password when prompted.

#vim /etc/samba/smbusers

  <username> = "root"

  <username> = "seed"

#service smbd restart

Note

These must be valid user accounts that can log in to the servers.

Create Alfresco Cluster db

  1. CREATE DATABASE alfresco_cluster DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
  2. GRANT ALL PRIVILEGES ON alfresco_cluster.* TO alfresco_cluster@'%' IDENTIFIED BY 'alfresco_cluster';
  3. GRANT SELECT,LOCK TABLES ON alfresco_cluster.* TO alfresco_cluster@'%' IDENTIFIED BY 'alfresco_cluster';
  4. FLUSH PRIVILEGES;

Note

  • Make sure the following line is commented out in /etc/mysql/my.cnf:

#bind-address            = 127.0.0.1

  • Instead of '%' in the GRANT statements above, the IP addresses or DNS names of the Alfresco servers can also be used.

Test that the DB and shared drive are accessible from the servers where Alfresco will be set up

Test that the database is accessible from the Alfresco servers.

#mysql -u alfresco_cluster -p -h 10.0.0.19
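Before logging in with the mysql client, it can help to confirm that the storage server's ports are reachable at all from each Alfresco node. The following is a minimal sketch in Python, not part of the original setup. The function name can_connect is our own, and the ports in the usage comment (3306 for MySQL, 445 for Samba) are the usual defaults, which we assume have not been changed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the "with" block closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Example usage on an Alfresco node (10.0.0.19 is the storage server in
# this post; the port numbers are assumed defaults):
#   can_connect("10.0.0.19", 3306)   # MySQL
#   can_connect("10.0.0.19", 445)    # Samba
```

A False result only says that the port could not be reached. It does not distinguish between a firewall rule, a stopped service, or MySQL still being bound to 127.0.0.1.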

Mount the shared drive on both Alfresco servers (Alfresco Node 1 and 2)

#mkdir /opt/alfrescoClusterData

#smbmount //10.0.0.19/alfresco /opt/alfrescoClusterData -o user=root,password=seed

Note

 In order for the shared drive to be mounted at startup, the following entry needs to be added to /etc/fstab (the options must be comma-separated without spaces)

//10.0.0.19/alfresco /opt/alfrescoClusterData smbfs users,rw,username=<username>,password=<password>,dmask=777,fmask=777 0 0
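Once the drive is mounted, a quick way to confirm that a node can really read and write the shared content store is to round-trip a small marker file. This is only a sketch under our own assumptions: the function name check_shared_store and the marker file name are hypothetical, and the check should be run as the user that Alfresco runs as.

```python
import os
import uuid


def check_shared_store(mount_point):
    """Write, read back and delete a small marker file under mount_point."""
    # A random suffix avoids clashes if both nodes run the check at once.
    marker = os.path.join(mount_point, ".cluster-check-" + uuid.uuid4().hex)
    payload = "alfresco-cluster-check"
    try:
        with open(marker, "w", encoding="utf-8") as handle:
            handle.write(payload)
        with open(marker, "r", encoding="utf-8") as handle:
            return handle.read() == payload
    except OSError:
        # Missing directory, read-only mount, permission problem, ...
        return False
    finally:
        # Clean up so the content store is left as we found it.
        if os.path.exists(marker):
            os.remove(marker)


# Example usage on each Alfresco node:
#   check_shared_store("/opt/alfrescoClusterData")
```

Note that the check also passes on an unmounted local directory, so it is worth confirming separately (for example with os.path.ismount) that /opt/alfrescoClusterData is really the Samba share.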

Installing Alfresco

Installing Alfresco Node 1

Install Alfresco in the usual manner, after creating a test db and using the local alf_data folder for the content store.

After ensuring that you have a clean log and can log in to Alfresco Share, change the following in alfresco-global.properties:

dir.root=/opt/alfrescoClusterData/alf_data (Points to the mapped samba drive on the local alfresco server)

db.username=alfresco_cluster

db.password=alfresco_cluster

db.name=alfresco_cluster

db.url=jdbc:mysql://10.0.0.19:3306/alfresco_cluster?useUnicode=yes&characterEncoding=UTF-8

dir.keystore=/opt/alfresco-4.1.6/alf_data/keystore (Update accordingly depending on where the keystore folder is located)

Set Solr to rebuild the indices

  •        Delete the contents of the archive SpacesStore at alf_data/solr/archive/SpacesStore/*
  •        Delete the contents of the workspace SpacesStore at alf_data/solr/workspace/SpacesStore/*
  •        Delete the cached content model data at alf_data/solr/archive-SpacesStore/alfrescoModels/*
  •        Delete the cached content model data at alf_data/solr/workspace-SpacesStore/alfrescoModels/*
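The four deletions above can be scripted so that they are repeated identically on each node. The sketch below is one possible way to do it in Python, not the procedure we originally used. The function name clear_solr_indexes and its dry_run flag are our own, the alf_data path is passed in because the Solr indexes are local to each node, and we assume Alfresco is stopped while it runs.

```python
import shutil
from pathlib import Path

# Relative paths taken from the list above.
SOLR_DIRS = (
    "solr/archive/SpacesStore",
    "solr/workspace/SpacesStore",
    "solr/archive-SpacesStore/alfrescoModels",
    "solr/workspace-SpacesStore/alfrescoModels",
)


def clear_solr_indexes(alf_data, dry_run=True):
    """Delete the contents (not the directories themselves) of SOLR_DIRS.

    Returns the paths that were removed, or that would be removed when
    dry_run is True.
    """
    removed = []
    for relative in SOLR_DIRS:
        directory = Path(alf_data) / relative
        if not directory.is_dir():
            continue  # nothing to clear for this entry on this node
        for entry in sorted(directory.iterdir()):
            removed.append(str(entry))
            if dry_run:
                continue
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return removed


# Example usage (the path is an assumption based on the install location
# used in this post):
#   clear_solr_indexes("/opt/alfresco-4.1.6/alf_data")                  # preview
#   clear_solr_indexes("/opt/alfresco-4.1.6/alf_data", dry_run=False)   # delete
```

Defaulting to a dry run is deliberate: the function only lists what it would delete until dry_run=False is passed explicitly.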

Restart Alfresco

Installing Alfresco Node 2

After making sure Alfresco Node 1 started properly, repeat the same steps as above for Alfresco Node 2.

 

Alfresco Cluster Settings

Now that our two Alfresco nodes are connected to a single database and content store, we need to configure both Alfresco servers to participate in a cluster.

Alfresco uses JGroups for multicast communication between servers. JGroups sends the initial broadcast messages that announce a server's availability, and it also manages the underlying communication channels and cluster entry and exit. Initiating clustering takes two steps: first, set the properties for the JGroups protocol so that each node knows how to talk to the other Alfresco instance; second, configure the L2 cache. The level 2 (L2) cache provides out-of-transaction caching of Java objects inside the Alfresco system. Alfresco provides support for EHCache, which does not restrict the Alfresco system to any particular application server, so it is completely portable.

 

In order to initiate Alfresco clustering, the following changes are required:

Alfresco Node 1 and Node 2

  • cp -r /opt/alfresco-4.1.6/tomcat/shared/classes/alfresco/extension/ehcache-custom.xml.sample.cluster /opt/alfresco-4.1.6/tomcat/shared/classes/alfresco/extension/ehcache-custom.xml

  • Comment out the following in /opt/alfresco-4.1.6/tomcat/shared/classes/alfresco/extension/ehcache-custom.xml

            <!-- <cacheManagerPeerListenerFactory

            class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"

            properties="socketTimeoutMillis=10000"

            /> -->

  • Uncomment the following in /opt/alfresco-4.1.6/tomcat/shared/classes/alfresco/extension/ehcache-custom.xml

            <cacheManagerPeerListenerFactory

            class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"

            properties="hostName=${alfresco.ehcache.rmi.hostname},

            port=${alfresco.ehcache.rmi.port},

            remoteObjectPort=${alfresco.ehcache.rmi.remoteObjectPort},

            socketTimeoutMillis=${alfresco.ehcache.rmi.socketTimeoutMillis}" />

 

# Add the following in alfresco-global.properties for Alfresco Node 1

###Cluster Configs Alfresco Node 1

alfresco.cluster.name=AlfrescoCluster

alfresco.jgroups.defaultProtocol=TCP

alfresco.tcp.start_port=7800

#Add the list of alfresco nodes here-dns or ip

alfresco.tcp.initial_hosts=10.0.0.10[7800],10.0.0.119[7800]

# ip or dns of local alfresco server

alfresco.ehcache.rmi.hostname=10.0.0.10

# Should be same as alfresco.ehcache.rmi.hostname

alfresco.rmi.services.external.host=10.0.0.10

alfresco.ehcache.rmi.port=40001

alfresco.ehcache.rmi.remoteObjectPort=45001

 

# Add the following in alfresco-global.properties for Alfresco Node 2

###Cluster Configs Alfresco Node 2

alfresco.cluster.name=AlfrescoCluster

alfresco.jgroups.defaultProtocol=TCP

alfresco.tcp.start_port=7800

#Add the list of alfresco nodes here-dns or ip

alfresco.tcp.initial_hosts=10.0.0.10[7800],10.0.0.119[7800]

#ip or dns of local alfresco server

alfresco.ehcache.rmi.hostname=10.0.0.119

# Should be same as alfresco.ehcache.rmi.hostname

alfresco.rmi.services.external.host=10.0.0.119

alfresco.ehcache.rmi.port=40001

alfresco.ehcache.rmi.remoteObjectPort=45001

#Restart Alfresco and check the logs to see if the Cluster with the name AlfrescoCluster (or whatever alfresco.cluster.name has been set to) has been started.
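The two property blocks above differ only in the local host address: alfresco.tcp.initial_hosts lists every node and is identical everywhere, while alfresco.ehcache.rmi.hostname and alfresco.rmi.services.external.host name the node itself. When more nodes are added, generating the blocks avoids copy and paste mistakes. The following Python sketch shows the idea; the function cluster_properties is a hypothetical helper of ours, not an Alfresco tool, and it only emits the properties shown in this post.

```python
def cluster_properties(local_host, all_hosts, cluster_name="AlfrescoCluster",
                       jgroups_port=7800, rmi_port=40001,
                       remote_object_port=45001):
    """Return the cluster lines of alfresco-global.properties for one node."""
    if local_host not in all_hosts:
        raise ValueError(local_host + " is not in the list of cluster hosts")
    # Same value on every node: host[port] entries separated by commas.
    initial_hosts = ",".join(
        "{}[{}]".format(host, jgroups_port) for host in all_hosts
    )
    return "\n".join([
        "alfresco.cluster.name=" + cluster_name,
        "alfresco.jgroups.defaultProtocol=TCP",
        "alfresco.tcp.start_port={}".format(jgroups_port),
        "alfresco.tcp.initial_hosts=" + initial_hosts,
        # The next two values must both point at the local node.
        "alfresco.ehcache.rmi.hostname=" + local_host,
        "alfresco.rmi.services.external.host=" + local_host,
        "alfresco.ehcache.rmi.port={}".format(rmi_port),
        "alfresco.ehcache.rmi.remoteObjectPort={}".format(remote_object_port),
    ])


nodes = ["10.0.0.10", "10.0.0.119"]
for node in nodes:
    print("### Cluster configs for " + node)
    print(cluster_properties(node, nodes))
    print()
```

The default port numbers are simply the ones used in this post; any free ports can be passed in instead, as long as they are open between the nodes.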

Note

  • Make sure the proper Alfresco license is being used.

Testing the Alfresco clustering

  1. Log in to Alfresco Node 1 as admin and create a folder named ClusterFolder1.

  2. Log in to Alfresco Node 2 as admin, check that the folder ClusterFolder1 can be viewed, and create a folder named ClusterFolder2.

  3. Log in to Alfresco Node 1 again as admin and check that the folder ClusterFolder2 can be viewed.

Load Balancer Set Up

Apache can be used to load balance incoming requests between the Alfresco application servers. Normally the load balancer should be set up on its own server, but for the sake of convenience it is set up on our storage server. The commands below are Ubuntu-specific and need to be modified accordingly for other Linux distributions.

#Install Apache using apt-get install apache2 (use yum install httpd for redhat)

Test by going to http://localhost

Apache connects to Alfresco on Tomcat via the mod_proxy_ajp module and the integrated software load balancer (mod_proxy_balancer) module.

mod_proxy_ajp and mod_proxy_balancer are installed by default when installing apache but need to be enabled.

#a2enmod proxy proxy_ajp proxy_balancer

For redhat add the following to httpd.conf

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so

 

#mv /etc/apache2/sites-available/default /etc/apache2/sites-available/default-original

#Create a new default file and add the following:

For Red Hat, add the following to httpd.conf

<VirtualHost *:80>

                ProxyRequests off

                # Normally a proper DNS name should be used

                ServerName 10.0.0.19

                DocumentRoot /var/www

                <Directory />

                                Options FollowSymLinks

                                AllowOverride None

                </Directory>

                <Directory /var/www/>

                                Options Indexes FollowSymLinks MultiViews

                                AllowOverride None

                                Order allow,deny

                                allow from all

                </Directory>

                 ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/

                <Directory "/usr/lib/cgi-bin">

                                AllowOverride None

                                Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch

                                Order allow,deny

                                Allow from all

                </Directory>

                 ErrorLog ${APACHE_LOG_DIR}/error.log

                 # Possible values include: debug, info, notice, warn, error, crit,

                # alert, emerg.

                LogLevel warn

                 CustomLog ${APACHE_LOG_DIR}/access.log combined    

        <Proxy balancer://alfresco-cluster>

                # alfresco node1

                BalancerMember ajp://10.0.0.10:8009 min=10 max=100 route=node1 loadfactor=1

                # alfresco node2

                BalancerMember ajp://10.0.0.119:8009 min=20 max=200 route=node2 loadfactor=2

                 # Security: technically we aren't blocking

                # anyone, but this is the place to make those

                # changes

                Order Deny,Allow

                Deny from none

                Allow from all

                 # Load Balancer Settings

                # We are configuring a simple request-counting

                # (round robin style) load balancer. Each Alfresco

                # node takes a share of the load in proportion

                # to its loadfactor.

                #ProxySet lbmethod=byrequests

                ProxySet stickysession=JSESSIONID

         </Proxy>

         # balancer-manager

        # This tool is built into the mod_proxy_balancer

        # module and will allow you to do some simple

        # modifications to the balanced group via a gui

        # web interface.

        <Location /balancer-manager>

                SetHandler balancer-manager

                 # I recommend locking this one down to

                # your office

                Order deny,allow

                Allow from all

        </Location>

         # Point of Balance

        # This setting allows us to explicitly name the

        # location in the site that we want to be

        # balanced. In this example we balance /alfresco

        # and /share.

        ProxyPass /balancer-manager !  

 ProxyPass /alfresco balancer://alfresco-cluster/alfresco

 ProxyPass /share balancer://alfresco-cluster/share

 </VirtualHost>
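The loadfactor values in the configuration above (1 for node1, 2 for node2) decide how new requests are shared out when no sticky session applies. As we understand the Apache documentation for the byrequests method, each worker accumulates credit in proportion to its loadfactor, and the worker with the most credit serves the next request. The Python sketch below is a simplified model of that idea, written by us for illustration. It ignores sticky sessions, failed workers and everything else Apache really does, and the function name simulate_byrequests is hypothetical.

```python
from collections import Counter


def simulate_byrequests(load_factors, requests):
    """Simplified model of lbmethod=byrequests; returns requests per route."""
    status = {route: 0 for route in load_factors}
    total = sum(load_factors.values())
    served = Counter()
    for _ in range(requests):
        # Every worker earns credit in proportion to its loadfactor.
        for route, factor in load_factors.items():
            status[route] += factor
        # The worker with the most credit serves the request
        # (the first one listed wins a tie)...
        chosen = max(status, key=status.get)
        # ...and pays back the total credit handed out in this round.
        status[chosen] -= total
        served[chosen] += 1
    return dict(served)


result = simulate_byrequests({"node1": 1, "node2": 2}, 300)
print(result["node1"], result["node2"])  # 100 200
```

With the values from this post, node2 ends up serving twice as many requests as node1. Setting both loadfactor values to the same number gives an even split.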

Tomcat adds the name of the Tomcat instance to the end of its session id cookie (i.e. JSESSIONID), separated from the session id by a dot (.). Thus if the Apache web server finds a dot in the value of the sticky cookie, it only uses the part after the dot to search for the route.

In order to let Tomcat know about its instance name, we need to set the attribute jvmRoute inside the Tomcat configuration file conf/server.xml to the value of the route of the BalancerMember that connects to the respective Tomcat.

#vim /opt/alfresco-4.1.6/tomcat/conf/server.xml

Alfresco Node 1

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">

Alfresco Node 2

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2">
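To make the route lookup concrete, here is a small Python sketch of how the route can be read from a JSESSIONID value once jvmRoute is set. It only illustrates the rule described above and is not Apache's actual code. The function name and the sample session id are made up, and splitting on the first dot is our assumption (Tomcat session ids normally contain no other dots).

```python
def route_from_session_cookie(jsessionid):
    """Return the part after the dot of a JSESSIONID value, or None."""
    session_id, dot, route = jsessionid.partition(".")
    # No dot, or nothing after it, means there is no route to stick to.
    return route if dot and route else None


print(route_from_session_cookie("0123456789ABCDEF.node1"))  # node1
```

A cookie without a route suffix yields None, which corresponds to the case where the balancer is free to pick any node.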

 

Alfresco can now be accessed at http://10.0.0.19/share, and requests should be passed on to one of the Alfresco Tomcat servers.

Note

Uncomment the following in tomcat/conf/server.xml and check the localhost_access_log.*.txt file in tomcat/logs to see which Alfresco node requests are being sent to.

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"

               prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/>

 

This concludes our post on clustering the Alfresco application. We hope it gives you a good starting point for venturing into Alfresco clustering.

Reference

http://docs.alfresco.com/4.1/index.jsp?topic=%2Fcom.alfresco.enterprise.doc%2Fconcepts%2Fha-intro.html

http://www.ixxus.com/blog/2012/01/getting-started-setting-alfresco-cluster

http://www.cignex.com/articles/alfresco-cluster-configuration

 
