AI should not perfect the human voice

I have been thinking that machines must not perfect the human voice. Racial and gender discrimination has hurt generations of people for a very long time, so why make digital assistants such as Siri or Alexa speak in a feminine voice, a masculine voice, a particular dialect, or any one specific voice? Given how constantly we interact with machines, does it make sense to discriminate by voice? I think it would be best for machines to keep a gender-neutral and accent-neutral voice that sounds neither male nor female. Make it neutral. That way we don’t add to the pain of discrimination that people keep suffering. One universal voice for AI that speaks all (or most) human languages without preferring one tone or dialect over another. What do you all think?

AI/ML supporting hometowns of international students: what, how and why?

By now, nearly everyone has either heard of or used artificial intelligence (AI) and machine learning (ML) in some form. Some students are already publishing papers in the field, while others are applying various AI techniques in their research, internships, or just for fun. Professionals in the industry have either incorporated some form of AI/ML into their products or services or are currently considering it. Either way, AI and ML have a lot to offer, but not without a good amount of data, significant processing power, the right skill sets, and a lot of patience in designing and executing such projects. To address that problem, AutoML is a promising new technique that allows researchers and professionals to use pre-trained models and cloud-based services to roll out AI solutions much more rapidly than by building machine learning models from scratch. AutoML provides the methods and processes to apply, integrate, deploy, and scale machine learning intelligence without requiring expert knowledge. Major AI platforms, starting with Google and followed by Microsoft, H2O.ai, and others, are promoting AutoML as the next evolutionary frontier in artificial intelligence, so that humans can spend zero time recreating machine learning models from scratch and instead focus on applying the models while letting machines take care of building them.

References:

- PwC (2017), “PwC’s Global Artificial Intelligence Study: Exploiting the AI Revolution” https://www.pwc.com/gx/en/issues/data-and-analytics/publications/artificial-intelligence-study.html

- National Foundation for American Policy (October 2017), “The Importance of International Students to American Science and Engineering” http://nfap.com/wp-content/uploads/2017/10/The-Importance-of-International-Students.NFAP-Policy-Brief.October-20171.pdf

English on non-English NLP and machine learning projects

Whenever I ask bilingual (English plus another language) students or professionals working on machine learning or artificial intelligence whether they have considered doing an AI project for their non-English mother tongue, such as Hindi, Spanish, or Arabic, they look at me puzzled and surprised. Yes, there are a lot of publications on all sorts of languages, but how often do you see innovative products on the market for non-English-speaking customers, even in English-speaking nations? The US has a huge immigrant population and has neighborhoods where English is barely spoken. Why not use deep learning to develop more intelligent products that target non-English speakers, instead of coming up with yet another piece of translation software every time? We need to think beyond the status quo of research, products, software, and publications that are predominantly English. It is challenging, I admit, because it is so easy to code in English with an English programming language syntax, editor, OS, and GUI, and because non-English corpora are hard to find. Mandarin is the exception here. Still, it is not impossible to do more for non-English-speaking societies.

New Arabic NLP findings – An Arabic speech corpus and an Arabic Root Finder neural net

I published a new blog post on Arabic.Computer about two projects I found on the Internet that are useful for Arabic NLP initiatives. The first is the Arabic Speech Corpus, recorded in a Damascene accent and provided by Nawar Halabi under a non-commercial license. The other is Arabic Root Finder, a useful Keras/Scikit-Learn neural network for finding Arabic word roots, offered by Tyler Boyd under a GPL-3 license.

Links and more information are available in the blog post.

Arabic.Computer

I started a new site http://arabic.computer to share and store any material that is related to the Arabic language and technology. For the time being, I will just post material that I encounter whenever I happen to be reading or researching online. Later on, I will expand more on it and eventually hope to drive our Arab community of engineers, researchers, and technologists to share any Arabic-related work.

Build with Watson

Last night I played around with the IBM Watson Internet of Things and Bluemix cloud services platform using their developer tools. IBM changed its developer pricing policy so that free Lite accounts no longer expire, which is great for unlimited prototyping. As part of their machine learning development platform, I experimented with training Watson Conversation dialogs and natural language understanding using their visual tool and the GitHub sample code. Later, as part of their IoT development platform, I ran an IoT simulator using a sample Node.js app that I uploaded to their cloud with Cloud Foundry. A feature that I liked but have not played with yet is IBM recipes for IoT using Node-RED, a flow-based programming tool that was originally created by IBM engineers and is now part of the JS Foundation. I plan to run a real prototype with my Raspberry Pis and Arduinos soon, and potentially extend it with Watson for an office prototype project.

Check out the IBM Watson developer program and IBM Watson for IoT.

Reverse XML – my lessons learned, for millennials’ sake

In April of 2001, I was very much into XML and had some innovative ideas of my own. I was working as a software engineer at Knowledgeview. The company was heavily focused on developing content syndication software for news agencies and newspapers. At that time, Perl was still the dominant language for parsing text and Java was the popular language for websites. Knowledgeview was adamant about using standard specifications including NewsML, NITF, RSS, and more. XML, initially defined in 1998, was starting to become a topic of discussion within the office while I was working there between 1999 and 2001. As a young programmer working mainly from the Lebanon office, I had less exposure to the technologies that the London office had at the time. But that did not stop me from trying to innovate. On April 23, 2001, I sent an email to the XML mailing list at Knowledgeview that said:

Date: Mon, 23 Apr 2001 11:39:03 +0100

Attached is a 4-page white paper about a concept that crossed my mind last week. The whole idea initially started after a brief conversation with Dr Ali about XML in which he mentioned that not all companies may integrate XML in their applications. Such a remark made me think of solutions that would keep one form of framework for such companies to exchange their data (that are based on customized and different structuring) without resorting to applications’ modifications (expensive) but where standards (like any form of XML) still apply.

What I basically said in the document is this: if the industry is heading towards XML as a standard form of communication across systems or applications, and if some companies may not be quick to jump onto XML, why not create an orchestration mechanism that lets a company A and a company B share data by bridging their own custom formats with the help of a middle player, a two-way translator? The steps would be as follows:

  1. let each company declare its set of delimiters D for its content and publish the format on a common repository
  2. define a set of XSL rules that convert each set of delimiters D from (1) into a universal XML format X
  3. any time company B would like to leverage data from company A, company B would query the common repository for company A’s data specification and execute the set of rules in (2) to convert the text from company A into the format needed by company B
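
The three steps above can be sketched in a few lines of Python. This is only an illustration and not the code from the original white paper: the in-memory `FORMAT_REPOSITORY` dictionary, the company names, the field names, and the helper functions are all hypothetical, and plain Python with `xml.etree.ElementTree` stands in for the XSL rules mentioned in step 2.

```python
import xml.etree.ElementTree as ET

# Step 1 (assumed shape): each company publishes its delimiters and field
# order in a shared repository. A dict stands in for that repository here.
FORMAT_REPOSITORY = {
    "company_a": {"record_sep": "\n", "field_sep": "|",
                  "fields": ["headline", "byline", "body"]},
    "company_b": {"record_sep": "\n", "field_sep": ";",
                  "fields": ["byline", "headline"]},
}


def to_universal_xml(company, raw_text):
    """Step 2: turn a company's delimited text into a common XML tree."""
    spec = FORMAT_REPOSITORY[company]
    root = ET.Element("records", source=company)
    for line in raw_text.split(spec["record_sep"]):
        if not line.strip():
            continue  # skip blank records
        record = ET.SubElement(root, "record")
        values = line.split(spec["field_sep"])
        for name, value in zip(spec["fields"], values):
            ET.SubElement(record, name).text = value.strip()
    return root


def from_universal_xml(company, root):
    """Render the common XML tree in the delimited format a company expects."""
    spec = FORMAT_REPOSITORY[company]
    lines = []
    for record in root.findall("record"):
        # Fields the target wants but the source lacks become empty strings.
        values = [record.findtext(name, default="") for name in spec["fields"]]
        lines.append(spec["field_sep"].join(values))
    return spec["record_sep"].join(lines)


def exchange(source, target, raw_text):
    """Step 3: look up both specs and bridge source text into target format."""
    return from_universal_xml(target, to_universal_xml(source, raw_text))


if __name__ == "__main__":
    feed = "Storm hits coast|J. Doe|Heavy rain fell.\nMarkets rally|A. Roe|Stocks rose."
    print(ET.tostring(to_universal_xml("company_a", feed), encoding="unicode"))
    print(exchange("company_a", "company_b", feed))
```

Neither company changes its own application in this sketch. Each one only publishes a description of its format, and the translator in the middle does the bridging through the shared XML form.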

I named the technique Reverse XML.

I did not hear from anyone in the company about my idea. I was 27 at the time and still young in the industry. I did not push myself, nor did I know any better way to articulate my idea than just emailing. A few months later I left the company, not because of this but because I had decided to move to the United States and build a new future with my wife.

Why am I saying all this? I thought I had a great idea at the time. Given my limited resources, I really did not know whether there might have been a similar product out there. Maybe if I had known then what I know now, I could have been more aggressive in marketing my idea. I would also ask myself why I sent a subsequent email saying that a competing product existed when I had not really tried that product. Whatever the case, my idea was truly unique. Moreover, even if I did not hear from my management, I should have tried another way. Nevertheless, I was proud of my idea and the name that I gave it: Reverse XML. I tried to be imaginative and was thinking big. In my conclusion I wrote:

The application may be established as a free service where the following could be our revenue:

  1. Our database would include all companies’ source formats, where our content-representation language that should handle all of these formats may create a data bridge between all those companies whose applications are not XML-friendly yet.
  2. Having said point “1”, our database would also be valuable because we will be able to market-focus our products that may be of interest to these companies.

Note: “Reverse XML” may be free to use and our company may provide as a paid service the option to write the client’s “content key” files.

I was thinking of open-sourcing the solution while providing a paid service to assist companies.

Who knows, maybe it would have been a great business opportunity or a great success story. Maybe this idea would have become as big as the JSON format is nowadays. What if I had patented the idea or made something more of it? There is no shortage of dreams and thoughts of great accomplishments. Why not? Unfortunately, I did not push for it, and at the same time I could not convey its value in a more presentable fashion.

I later received a call from management, but in the end the idea was neither really understood nor accepted. I still believe that, at the time, this idea could have had great potential. But that does not matter. What matters is not to give up on your ideas. Push for them. If you are in your late twenties or early thirties, or any age really, and you have a great idea, push for it with your heart and soul. It might win big, and if it does not, learn from your mistakes and do something even better the next time. Don’t just wait for someone to call you. Be proactive and make the call, not once but several times.

Note: I have no intention of saying anything negative about my past employer. This post is only meant to illustrate the point that if you feel strongly about your idea, you should push harder and not just accept the status quo.

You can read my original document here.