Saturday, August 7, 2010

Translating Sitecore items using Google Translate

A multilingual website is one of the most effective ways to increase sales and reach potential customers in other regions. Multi-language management is a cornerstone of Sitecore CMS, integrated into all aspects of content and site management. It allows you to build a multi-language website with almost the same effort as a single-language one. See this amazing case study about building a website in 28 languages.

But what if you want to translate an existing site with thousands of pages and hundreds of templates and website sections? How do you identify possible architecture problems (like "Shared" fields that should actually be translated) or implementation bugs (hard-coded text, an item name displayed instead of a title, and so on)? Sure, you can ask content editors to spend a few weeks translating and testing the site. But how about instant website translation using an online service? Translation quality is not that important when you simply need to identify possible problems after translation, so... let's start!

In this example, I'll use Google Translate as the translation provider. I also tried the Bing Translator service; it's almost identical, but it does not provide an easy API like google-api-for-dotnet.


First, we need to define the commands that will be called from the Sitecore back-end. Two commands will be enough to start with:

namespace Sitecore.Custom.Translation.Commands
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using Sitecore.Shell.Framework.Commands;
    using Sitecore.Data.Fields;
    using Sitecore.Data.Items;
    using Sitecore.Diagnostics;
    using Sitecore.Shell.Applications.Dialogs.ProgressBoxes;
    using Sitecore.Globalization;
    using Sitecore.Jobs;

    class TranslateItemCommand : Command
    {
        // Constants, Execute and the translation helpers are added step by step below.
    }
}

namespace Sitecore.Custom.Translation.Commands
{
    using Sitecore.Data.Items;

    class TranslateTreeCommand : TranslateItemCommand
    {
        protected override Item TranslateItem(Item item)
        {
            var translationService = new GoogleTranslateService(SourceLanguage, item.Language.CultureInfo.TwoLetterISOLanguageName);

            TranslateItem(item, translationService);
            var items = item.Axes.GetDescendants();
            foreach (var childItem in items)
            {
                TranslateItem(childItem, translationService);
            }

            return item;
        }
    }
} 

Let's start extending the TranslateItemCommand step by step.

We don't need to translate fields from the Standard Template, or field types such as DropList, CheckBox, etc., so the following methods will be useful:

        private bool FieldIsTranslatable(Field field)
        {
            return !((field.TypeKey == "image") ||
                    (field.TypeKey == "reference") ||
                    (field.TypeKey == "general link") ||
                    (field.TypeKey == "datetime") ||
                    (field.TypeKey == "droplink") ||
                    (field.TypeKey == "droplist") ||
                    (field.TypeKey == "treelist") ||
                    (field.TypeKey == "droptree") ||
                    (field.TypeKey == "multilist") ||
                    (field.TypeKey == "checklist") ||
                    (field.TypeKey == "treelistex") ||
                    (field.TypeKey == "checkbox"));
        }

        private bool FieldIsStandard(Field field)
        {
            return field.Definition.Template.Name == "Advanced" ||
                   field.Definition.Template.Name == "Appearance" ||
                   field.Definition.Template.Name == "Help" ||
                   field.Definition.Template.Name == "Layout" ||
                   field.Definition.Template.Name == "Lifetime" ||
                   field.Definition.Template.Name == "Insert Options" ||
                   field.Definition.Template.Name == "Publishing" ||
                   field.Definition.Template.Name == "Security" ||
                   field.Definition.Template.Name == "Statistics" ||
                   field.Definition.Template.Name == "Tasks" ||
                   field.Definition.Template.Name == "Validators" ||
                   field.Definition.Template.Name == "Workflow";
        }
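
The two checks above are plain membership tests, so they can also be written as set lookups. The following is only a sketch of that idea: the class name FieldFilters and its methods are hypothetical, and it works on plain strings (the values of field.TypeKey and field.Definition.Template.Name) so that it runs without Sitecore.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: the same lists as FieldIsTranslatable and FieldIsStandard,
// kept in sets so that adding or removing an entry is a one-line change.
public static class FieldFilters
{
    // Field type keys whose values are references, dates or flags rather than text.
    private static readonly HashSet<string> NonTranslatableTypeKeys = new HashSet<string>
    {
        "image", "reference", "general link", "datetime", "droplink", "droplist",
        "treelist", "droptree", "multilist", "checklist", "treelistex", "checkbox"
    };

    // Names of the Standard Template sections listed in FieldIsStandard.
    private static readonly HashSet<string> StandardSectionNames = new HashSet<string>
    {
        "Advanced", "Appearance", "Help", "Layout", "Lifetime", "Insert Options",
        "Publishing", "Security", "Statistics", "Tasks", "Validators", "Workflow"
    };

    // Pass field.TypeKey here.
    public static bool IsTranslatableType(string typeKey)
    {
        return !NonTranslatableTypeKeys.Contains(typeKey);
    }

    // Pass field.Definition.Template.Name here.
    public static bool IsStandardSection(string templateName)
    {
        return StandardSectionNames.Contains(templateName);
    }
}
```

Like the original methods, the lookups are case-sensitive.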

Now we need to define the Command class methods and some helpers:

        protected const string SourceLanguage = "en";
        protected const int MaxServiceRequestLength = 1500;

        public override void Execute(CommandContext context)
        {
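            Assert.ArgumentNotNull(context, "context");

            // Nothing to translate if the command was invoked without a selected item.
            if (context.Items.Length == 0)
            {
                return;
            }

            Item item = context.Items[0];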
            ProgressBox.Execute("ItemSync", "Translate", this.GetIcon(context, string.Empty), new ProgressBoxMethod(this.TranslateItem), "item:load(id=" + item.ID + ")", new object[] { item, context });
        }

        private void TranslateItem(params object[] parameters)
        {
            CommandContext context = parameters[1] as CommandContext;
            if (context != null)
            {
                Item item = parameters[0] as Item;
                if (item != null)
                {
                    this.TranslateItem(item);
                }
            }
        }

        protected virtual Item TranslateItem(Item item)
        {
            var translationService = new GoogleTranslateService(SourceLanguage, item.Language.CultureInfo.TwoLetterISOLanguageName);
             
            TranslateItem(item, translationService);
            return item;
        }

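        // Splits the text at spaces into blocks of roughly numberOfSymbols characters.
        // The space at each split point is dropped, so the caller has to restore it.
        private IEnumerable<string> SplitText(string text, int numberOfSymbols)
        {
            int offset = 0;
            List<string> lines = new List<string>();
            while (offset < text.Length)
            {
                // Last space at or before the end of the current window.
                int index = text.LastIndexOf(" ",
                                 Math.Min(text.Length, offset + numberOfSymbols));
                // If there is no space inside the window, take the rest of the text
                // (this block may then be longer than numberOfSymbols).
                string line = text.Substring(offset,
                    (index - offset <= 0 ? text.Length : index) - offset);
                offset += line.Length + 1;
                lines.Add(line);
            }

            return lines;
        }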
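
SplitText returns the whole remainder when it finds no space inside the current window, so a long run without spaces can exceed MaxServiceRequestLength. Below is a minimal sketch of a stricter variant, assuming a hard cut in the middle of such a run is acceptable. TextChunker and Split are hypothetical names and are not part of the attached source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone variant of SplitText that never returns a block
// longer than maxLength.
public static class TextChunker
{
    public static List<string> Split(string text, int maxLength)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (maxLength <= 0) throw new ArgumentOutOfRangeException("maxLength");

        var chunks = new List<string>();
        int offset = 0;
        while (offset < text.Length)
        {
            // The remainder fits into one block: take it and stop.
            if (text.Length - offset <= maxLength)
            {
                chunks.Add(text.Substring(offset));
                break;
            }

            // Search backwards for a space in positions offset .. offset + maxLength.
            int cut = text.LastIndexOf(' ', offset + maxLength, maxLength + 1);
            if (cut <= offset)
            {
                // No usable space in the window: cut hard at maxLength characters.
                chunks.Add(text.Substring(offset, maxLength));
                offset += maxLength;
            }
            else
            {
                // Cut at the space and skip over it.
                chunks.Add(text.Substring(offset, cut - offset));
                offset = cut + 1;
            }
        }

        return chunks;
    }
}
```

For example, TextChunker.Split("the quick brown fox", 10) returns the two blocks "the quick" and "brown fox". As with SplitText, the space at a word boundary is dropped, so the translated blocks need to be joined with a space again.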

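And finally, the main translation method. It takes the "en" version of the item (see the SourceLanguage constant) and translates it into the language version selected in the back-end. Note that it only acts when the item has no version in the target language yet, so existing translations are left untouched.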

        public void TranslateItem(Item item, ITranslationService service)
        {
            var sourceItem = Sitecore.Context.ContentDatabase.GetItem(item.ID, Sitecore.Globalization.Language.Parse(SourceLanguage));

            Job job = Context.Job;
            if (job != null)
            {
                job.Status.LogInfo(Translate.Text("Translating item by path {0}.", new object[] { item.Paths.FullPath }));
            }

            if (item.Versions.Count == 0)
            {
                if (sourceItem == null)
                {
                    return;
                }

                item = item.Versions.AddVersion();
                item.Editing.BeginEdit();

                foreach (Field field in sourceItem.Fields)
                {
                    if ((string.IsNullOrEmpty(sourceItem[field.Name]) || field.Shared))
                    {
                        continue;
                    }

                    if (!FieldIsTranslatable(field) || FieldIsStandard(field))
                    {
                        item[field.Name] = sourceItem[field.Name];
                    }
                    else
                    {
                        var text = sourceItem[field.Name];
                        var translatedText = string.Empty;

                        if (text.Length < MaxServiceRequestLength)
                        {
                            item[field.Name] = service.Translate(text);
                            continue;
                        }

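                        foreach (var textBlock in SplitText(text, MaxServiceRequestLength))
                        {
                            // SplitText drops the space at each split point, so put it back
                            // between the translated blocks.
                            if (translatedText.Length > 0)
                            {
                                translatedText += " ";
                            }

                            translatedText += service.Translate(textBlock);
                        }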

                        item[field.Name] = translatedText;
                    }
                }

                item.Editing.EndEdit();
            }
        }
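
The ITranslationService and GoogleTranslateService types used above are not listed in this post. Judging only from how they are called (a constructor taking the source and target language codes, and service.Translate(text)), the interface could look like the sketch below. The interface shape is an assumption, and MarkerTranslationService is a hypothetical stand-in that makes no network calls: it simply tags the text, which is already enough to spot hard-coded or untranslated content on the site.

```csharp
using System;

// Assumed shape, inferred from the service.Translate(text) calls above.
public interface ITranslationService
{
    string Translate(string text);
}

// Hypothetical stand-in for GoogleTranslateService: it does not translate anything,
// it only prefixes the text with the language pair.
public class MarkerTranslationService : ITranslationService
{
    private readonly string sourceLanguage;
    private readonly string targetLanguage;

    // Same constructor arguments as the GoogleTranslateService calls in the commands.
    public MarkerTranslationService(string sourceLanguage, string targetLanguage)
    {
        this.sourceLanguage = sourceLanguage;
        this.targetLanguage = targetLanguage;
    }

    public string Translate(string text)
    {
        // Leave empty values untouched so that empty fields stay empty.
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        return "[" + sourceLanguage + "->" + targetLanguage + "] " + text;
    }
}
```

With this stand-in, new MarkerTranslationService("en", "de").Translate("Home") returns "[en->de] Home", so any text on the translated site that lacks the prefix did not come from a translated field.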

Now that the coding is finished, let's add the commands to the back-end and start using them:

1) Add the following lines to the "Commands.config" file located in the "App_Config" folder:
<command name="contenteditor:translatetree" type="Sitecore.Custom.Translation.Commands.TranslateTreeCommand,Sitecore.Custom.Translation" />
<command name="contenteditor:translateitem" type="Sitecore.Custom.Translation.Commands.TranslateItemCommand,Sitecore.Custom.Translation" />

2) Add a new command to the core database here: /sitecore/content/Applications/Content Editor/Ribbons/Chunks/Translate

Here is the result:

To translate an item, navigate to it and select the language you want to translate it to:

Then press "Translate Tree". Translation time depends on the number of items and their fields, and may take a while.

In a few clicks, the items are translated into the specified languages.

And you can proceed with website verification.

I hope to find some time to package all this stuff and add it to the Shared Source library. For now, I've attached the source code; please let me know if something is missing. Thanks!