Sunday, March 14, 2021

Installing Java 8 on MacOS

If you, like me, need to install Java 8 on your Mac, then this is the TL;DR:

brew install openjdk@8

You will find more complete instructions in mkyong's excellent overview of How to Install Java on Mac OS X.

I needed Java 8 in order to play with MorganaXProc-III, a Java XProc 3 implementation. At the time of writing, that software only supports Java 8, which set me hunting for how to install Java 8 on my Mac.

Hopefully, this short note will also help you (or future me).

Wednesday, December 9, 2020

If You're a Student or Recent Graduate, I Can Help you Get an Internship or Job at Amazon

Amazon has a lot of student opportunities, including internships. If you're a recent graduate, and so interested in a full-time position, then check out the link.

Amazon is pretty transparent about how it interviews people. These videos are a good overview for SDEs. They are made by actual Amazon employees (the third one is by my colleague, Samir):

I didn't interview as an intern, but I did look up what Glassdoor had to say about the process and it was accurate about what I had to do. So these descriptions about process and questions are probably accurate.

Right now, we are only interviewing and working remotely (until at least June 2021).

Good luck!

Wednesday, February 5, 2020

Working at Amazon

Since August 2019, I've been working at Amazon Web Services. Six months in, and it has been a really great experience!

Can I help you get a job at Amazon? Yes, I can! I suggest looking at the roles available at Amazon Jobs. Pick out one or two and then contact me before you apply. I can help you understand how those jobs fit within the organization, and maybe help you identify other roles which might also be a good (or a better) fit for you. I can refer you internally, if you like, which can help get you in front of the hiring manager. And I can give you tips on the interview process, including the importance of the Amazon Leadership Principles. If things progress well, I can even give you some advice on negotiating your compensation. (Amazon's compensation package is quite different from that at other companies.)

There is a ton of opportunity at Amazon. Some people love it, others hate it. I love it :)

Tuesday, April 16, 2019

Standardized Rights Statements for News

At the IPTC Spring Meeting in Lisbon, I proposed IPTC Rights Statements For News, an approach inspired by

This approach would support both efficient filtering of content and sophisticated evaluation of restrictions. It is simple, flexible, accurate, descriptive and transparent.

You can read the full proposal at but, in a nutshell, the idea is:
1. Create a standard set of rights statements specific to news and media
2. Express each rights statement as a URL, which can be embedded in content and therefore is suitable for filtering
3. Enable each rights statement URL to be dereferenceable, meaning it can be evaluated by machines or by people
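
To make the filtering idea concrete, here is a minimal sketch in Python. It assumes each news item carries a single rights statement URL in its metadata. The URLs, the `NewsItem` class and the `filter_by_rights` function are all hypothetical names invented for illustration; they are not taken from the proposal itself.

```python
from dataclasses import dataclass

# Hypothetical rights statement URLs (assumptions for illustration only;
# the actual proposal may define different statements and URLs).
NO_RESTRICTIONS = "https://example.org/rights/no-restrictions"
EDITORIAL_ONLY = "https://example.org/rights/editorial-use-only"
CONTACT_PUBLISHER = "https://example.org/rights/contact-publisher"


@dataclass
class NewsItem:
    headline: str
    rights_statement: str  # URL embedded in the item's metadata


def filter_by_rights(items, allowed):
    """Keep only the items whose rights statement URL is in the allowed set."""
    allowed_set = set(allowed)
    return [item for item in items if item.rights_statement in allowed_set]


items = [
    NewsItem("Election results", NO_RESTRICTIONS),
    NewsItem("Celebrity portrait", EDITORIAL_ONLY),
    NewsItem("Archive footage", CONTACT_PUBLISHER),
]

# A customer that can only handle unrestricted or editorial-only content
usable = filter_by_rights(items, [NO_RESTRICTIONS, EDITORIAL_ONLY])
print([item.headline for item in usable])
```

Because the statement is just a URL, filtering is a simple string comparison. The more sophisticated evaluation of restrictions would come from dereferencing the URL, which this sketch does not attempt.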

I think that this approach would work well for expressing rights and restrictions for news and media. Having other industry players work with rights in a compatible way would help with adoption. If customers are getting rights statements in the same way from several publishers, it is more likely that CMS and MAM vendors will implement the necessary support.

If you're interested in finding out more about this or other IPTC initiatives, feel free to get in touch.

Thursday, October 25, 2018

Rights, Classification and Search Relevance for News - A Short Wrap Up of the IPTC Toronto Meeting

Last week, the IPTC held its Autumn meeting, a chance for people from around the world who have a common interest in news and media technology to discuss standards and learn from each other.

We had an entire day dedicated to news search, classification and descriptive metadata. AP's own Chad Schorr discussed our use of Elastic for robust indexing of content, Veronika Zielinska discussed AP's rules-based automated news classification system, and I reviewed our new automated tagging for images. We also heard from our peers at Bloomberg, New York Times, DPA and NTB on their systems. Finally, we had the opportunity to hear directly from Elastic on their suggestions for the best way to use their tools for news and media content.

A fairly momentous event for IPTC this year was Google Images' agreement to display image credit metadata for photos. This follows many years of discussions between IPTC, CEPIC and many others with the search giant. These talks came to a head during the CEPIC Congress / IPTC Photo Metadata Conference in Berlin in May 2018 and I'm very glad to see that concrete changes to Google Images' use of metadata followed swiftly after. During the discussion of IPTC's Rights work for the news and media industry, we discussed ways to build on this progress - centred on driving adoption of the RightsML standard for machine processable expressions of rights and restrictions. We also discussed ways that IPTC could cooperate with other organizations, such as Europeana, to drive adoption of rights metadata.

As with other IPTC face-to-face meetings, there were many other interesting presentations and discussions, including the latest developments in video metadata, sports data and hearing from Civil on their plans for blockchain-backed journalism. For a more complete overview, check out IPTC's posts on Day 1 and Day 2. And consider joining IPTC's next face-to-face meetings in Lisbon and Paris.

Monday, October 22, 2018

The View from Toronto - IPTC Chairman's Report 2018

I Chair the Board of Directors of IPTC, a consortium of news agencies, publishers and system vendors, which develops and maintains technical standards for news, including NewsML-G2, rNews and News-in-JSON. I work with the Board to broaden adoption of IPTC standards, to maximize information sharing between members and to organize successful face-to-face meetings.

We hold face-to-face meetings in several locations throughout the year, although most of the detailed work of the IPTC is now conducted via teleconferences and email discussions. Our Annual General Meeting for 2018 was held in Toronto in October. As well as being the time for formal votes and elections, the AGM is a chance for the IPTC to look back over the last year and to look ahead to what is in store. What follows is a slightly edited version of my remarks at the Toronto Annual General Meeting.

IPTC has had a good year – the 53rd year for the organization!

We've updated key standards, including NewsML-G2, the Video Metadata Hub and the Media Topics, as well as launching RightsML 2.0, a significant upgrade in the way to express machine processable rights for news and media.

Of course, IPTC standards are a means, not an end. The value of the standards is the easier exchange, consumption and handling of news and media by organizations large and small around the world. So it is important that we continue to focus on making our standards straightforward to use and have them adopted as widely as possible. I think we are making progress on the usability front, such as moving away from zip'd PDFs towards actual HTML web pages for documentation of NewsML-G2. Over the last year, we've continued to work with other organizations - W3C, Europeana and MINDS - to develop standards, increase adoption - and, perhaps most importantly, to open up IPTC to other perspectives. And we have had a huge win in the recognition of key photo metadata by Google Images. But we clearly need to do more for both usability and adoption. During the course of this meeting, we've had some good discussion about what more we can do in both areas and I encourage all members to help spread the word about IPTC standards, and suggest ways we can accelerate adoption.

Of course, the nature of news and media continues to evolve. On the one hand, new forms of storytelling are emerging, such as Augmented Reality and Virtual Reality. Equally, the use of data to power stories continues to increase, in both data-driven stories and data-supported stories. By data-driven stories, I mean journalists reviewing large databases of information and creating stories based on the trends they find. By data-supported stories, I mean content creators using visually-interesting graphics to support their content. The automated production, curation and consumption of news and media is likely to increase for the foreseeable future, driven by both technological improvements and the seductive economics of replacing people with algorithms. And it is not only economics which are driving these changes and challenges, just as it is no longer fill-in-the-blank text stories being written by robot journalists. Synthetic media - such as "deep fakes" - are able to produce increasingly convincing photo, video and audio stories that are indistinguishable from "real" media. Inevitably, the existence and debunking of these fakes will be used to deny legitimate reporting, with the implications of continued erosion of trust in media. All of these trends - AR, VR, data-powered journalism and dealing with trust, credibility and misinformation - are topics which IPTC has discussed over the last few years, but we have not developed any tracks of work to try to address them. In part, this is because these are, by definition, outside of the areas that our member organizations traditionally deal in and so are quite difficult to tackle in terms of establishing standards.

However, even within the context of standards, IPTC is opening up to new forms of experimentation. As we heard on Monday, the joint project between IPTC and MINDS, to allow for the identification of audience and interest metadata, has led to the introduction of structures within NewsML-G2 to support rapid prototyping and experimentation. I see this as a positive move, with great potential to accelerate the work we do and to help keep it lightweight and relevant.

Of course, IPTC has had significant changes of its own over the last year. We bid goodbye to Michael Steidl as our Managing Director of 15 years, and welcomed Brendan Quinn as our new Managing Director this summer. We're grateful that we continue to benefit from Michael's skills and experience, as he has remained the Chairman of the Photo and Video Working Group. And I think that Brendan has made a great start in his new role in helping us keep the IPTC moving forward.

As part of the handover from Michael to Brendan, we decided to scan a lot of the old paper documents, including various types of IPTC newsletter, dating back to 1967, two years after the organization was founded. I thought I would look back to what IPTC was up to in the year 2000, the year I became a delegate to the IPTC, back when I worked for Dow Jones.

And there I am in the photo at the top of the page. Or, at least, the back of my head. Some things are quite reminiscent of this week's meeting - the birth of NewsML, a focus on improved communications, cooperation with other organizations e.g. MPEG-7.

Then I thought I would look back on IPTC in 1968, the year I was born. Some things were similar to today - such as a focus on fine technical details such as Alphabet Number 5 and a plan to go to Lisbon next year for a meeting. However, most of the focus in those days was on lobbying against tariffs and satellite monopolies.

So I think it is fair to say that the IPTC has never been just a standards body. It is also, more broadly, a community of practice. We are a group of people from around the world who have a common interest in news and media technology. The process of sharing information and experiences with the group, through these face to face meetings and the online development of standards, means that the members of IPTC learn from each other, and so have an opportunity to develop professionally and personally. I hope you will agree that yesterday's discussion of news search and classification was an excellent example of exchange of experiences, both good and bad, which can help many of us avoid problems and seize opportunities, and so accelerate our work.

I think it is helpful for us to recognize that IPTC is a community which continues to evolve, as the interests, goals and membership of the organization change. I’m confident that – working together – we can continue to reshape the IPTC to better meet the needs of the membership and to move us further forward in support of solving the business and editorial needs of the news and media industry. I look forward to working with all of you on addressing the challenges in 2019 and beyond.

Wednesday, May 9, 2018

News Credibility, Verification and the Madness of Crowds - A Junk News Roundup

As the Associated Press states in our News Values and Principles:

"We have a long-standing role setting the industry standard for ethics in journalism. It is our job — more than ever before — to report the news accurately and honestly."

It is easy to see how AP is taking concrete steps in this area through, for example, our fact checking work (online, on Twitter). And the AP Verify project is building a "newsroom tool that will combine artificial intelligence with our editorial expertise to automatically source and verify user-generated content."

I thought it would be interesting to take a look at some efforts going on elsewhere in the areas of credibility, verification and identifying junk news.

Standards Efforts

The IEEE is working on P7011 "Standard for the Process of Identifying and Rating the Trustworthiness of News Sources". The IEEE is a formal standards body, responsible for many of the technical standards which underpin the internet.

The Credibility Coalition describes itself as "an interdisciplinary community committed to improving our information ecosystems and media literacy through transparent and collaborative exploration." It is not, in itself, a standards body. If you examine the CredCo "about" page, you will spot my photo - I attended early meetings.

The Credible Web W3C Community Group describes its mission as "to help shift the Web toward more trustworthy content without increasing censorship or social division." There is a significant overlap between members of the Credibility Coalition and the Credible Web Community Group. Despite the W3C link, this is not a formal standards effort - Community Groups are open to anyone. There are weekly video conferences to define an informal standard.

The Trust Project describes itself as "a consortium of top news companies" and says it "is developing transparency standards that help you easily assess the quality and credibility of journalism." Again, the Trust Project is not a formal standards body (like IEEE, IPTC or W3C).

Verification Projects

At the recent IPTC meeting, we saw presentations about two European projects aimed at helping to identify the spread of misinformation.

Truly Media is a joint project between ATC and Deutsche Welle. It is "a web-based collaboration platform developed to support primarily journalists and human rights workers in the verification of digital content," and was developed with funds from the EU and the DNI.

InVid aims to develop "a knowledge verification platform to detect emerging stories and assess the reliability of newsworthy video files and content spread via social media." It is an EU-funded project. Their demo was quite sophisticated. They also have a browser plugin which lets you verify news video and images yourself.

Wisdom and Madness

Finally, via Fair Warning, I saw "The Wisdom and Madness of Crowds" - a fun explainer in the form of a game. It walks you through why some crowds turn to madness and some to wisdom, with a focus on the spread of misinformation but also good information. It helps give some insight into the different dynamics at play and even some suggestions for how to reduce the spread of junk news and amplify the spread of verified news.