summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorRyan Baumann <ryan.baumann@gmail.com>2016-08-23 16:14:48 -0400
committerRyan Baumann <ryan.baumann@gmail.com>2016-08-23 16:14:48 -0400
commit48b73c35c312d9a00d94f34bffc7d5e1a8f10904 (patch)
tree696ce996ed06ea5cd4532e4df476806911b76200
parentacb4c878bde601de3c792fed38198ad03f69b21e (diff)
Initial pass at automatic hi-res NYPL Stereogranimator morphing
-rw-r--r--Dockerfile5
-rw-r--r--Gemfile3
-rw-r--r--Gemfile.lock29
-rw-r--r--README.md4
-rwxr-xr-xnypl_recrop.rb66
-rwxr-xr-xrun-sterogranimator-hi-res.sh3
-rwxr-xr-xtemplate_match_multiscale.py108
7 files changed, 216 insertions, 2 deletions
diff --git a/Dockerfile b/Dockerfile
index 23db38a..e554117 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,7 +1,9 @@
FROM teeps/cuda7.5-art-vid
MAINTAINER Ryan Baumann <ryan.baumann@gmail.com>
-RUN apt-get install -y bc
+RUN apt-get install -y bc python-opencv
+RUN apt-add-repository ppa:brightbox/ruby-ng && apt-get update && apt-get install -y ruby2.2 ruby2.2-dev
+RUN gem install bundler
ADD . /root/torch-warp
@@ -9,4 +11,5 @@ RUN cp -v *-static /root/torch-warp/
WORKDIR /root/torch-warp
+RUN bundle install
RUN cd consistencyChecker && make
diff --git a/Gemfile b/Gemfile
new file mode 100644
index 0000000..64ac3ac
--- /dev/null
+++ b/Gemfile
@@ -0,0 +1,3 @@
+source 'https://rubygems.org'
+gem 'rest-client'
+gem 'dimensions'
diff --git a/Gemfile.lock b/Gemfile.lock
new file mode 100644
index 0000000..2edfb89
--- /dev/null
+++ b/Gemfile.lock
@@ -0,0 +1,29 @@
+GEM
+ remote: https://rubygems.org/
+ specs:
+ dimensions (1.3.0)
+ domain_name (0.5.20160615)
+ unf (>= 0.0.5, < 1.0.0)
+ http-cookie (1.0.2)
+ domain_name (~> 0.5)
+ mime-types (3.1)
+ mime-types-data (~> 3.2015)
+ mime-types-data (3.2016.0521)
+ netrc (0.11.0)
+ rest-client (2.0.0)
+ http-cookie (>= 1.0.2, < 2.0)
+ mime-types (>= 1.16, < 4.0)
+ netrc (~> 0.8)
+ unf (0.1.4)
+ unf_ext
+ unf_ext (0.0.7.2)
+
+PLATFORMS
+ ruby
+
+DEPENDENCIES
+ dimensions
+ rest-client
+
+BUNDLED WITH
+ 1.12.5
diff --git a/README.md b/README.md
index 028a7a2..de85c5c 100644
--- a/README.md
+++ b/README.md
@@ -17,7 +17,9 @@ This process is inspired by Patrick Feaster's post on [Animating Historical Phot
For input, you need two PNG images of the same dimensions named e.g. `filename_0.png` and `filename_1.png`. You can then run `./run-torchwarp.sh filename` to run all the steps and output the morphing animation as `morphed_filename.gif`.
-You can also use `./run-stereogranimator.sh ID` with an image ID from [NYPL's Stereogranimator](http://stereo.nypl.org/) to download an animated GIF and run it through the morphing process.
+You can also use `./run-stereogranimator.sh ID` with an image ID from [NYPL's Stereogranimator](http://stereo.nypl.org/) to download an animated GIF at low resolution and run it through the morphing process.
+
+If you sign up for [the NYPL Digital Collections API](http://api.repo.nypl.org/), you can use your API token to download high-resolution original images. The `nypl_recrop.rb` script reads the token from the `NYPL_API_TOKEN` environment variable, and takes a Stereogranimator image ID as an argument, downloading the original TIFF image and using `template_match_multiscale.py` to calculate the crop and split the image into two views at full resolution. The `run-sterogranimator-hi-res.sh` script uses this process instead of `wget` with low-resolution GIFs. You can also pass the `NYPL_API_TOKEN` environment variable [in your `docker run` command](https://docs.docker.com/engine/reference/run/#/env-environment-variables).
## Docker Usage
diff --git a/nypl_recrop.rb b/nypl_recrop.rb
new file mode 100755
index 0000000..4ca1843
--- /dev/null
+++ b/nypl_recrop.rb
@@ -0,0 +1,66 @@
+#!/usr/bin/env ruby
+
+require 'json'
+require 'rest-client'
+require 'dimensions'
+
+NYPL_API_TOKEN = ENV["NYPL_API_TOKEN"]
+NYPL_AUTH = "Token token=\"#{NYPL_API_TOKEN}\""
+NYPL_ENDPOINT = "http://api.repo.nypl.org/api/v1/items"
+
+stereo_metadata = JSON.parse(RestClient.get("http://stereo.nypl.org/view/#{ARGV[0]}.json"))
+
+unless stereo_metadata['external_id'] == 0
+ abort('Image must be from NYPL collections.')
+end
+
+digital_id = stereo_metadata['digitalid'].upcase
+image_id = JSON.parse(RestClient.get("#{NYPL_ENDPOINT}/local_image_id/#{digital_id}", :Authorization => NYPL_AUTH))
+image_uuid = image_id['nyplAPI']['response']['uuid']
+image_captures = JSON.parse(RestClient.get("#{NYPL_ENDPOINT}/#{image_uuid}", :Authorization => NYPL_AUTH))
+
+matching_captures = image_captures['nyplAPI']['response']['capture'].select{|c| c['imageID'].upcase == digital_id}
+
+if matching_captures && matching_captures.length > 0
+ # capture_uuid = matching_captures[0]['uuid']
+ # capture_details = JSON.parse(RestClient.get("#{NYPL_ENDPOINT}/item_details/#{capture_uuid}", :Authorization => NYPL_AUTH))
+ highres_url = matching_captures[0]['highResLink']
+ lowres_url = stereo_metadata['url']
+
+ # download images
+ $stderr.puts "Downloading images..."
+ `wget -nc -O #{ARGV[0]}.jpg '#{lowres_url}'`
+ `wget -nc -O #{ARGV[0]}.tif '#{highres_url}'`
+
+ # calculate the crop for the original image using multiscale template matching
+ $stderr.puts "Calculating crop..."
+ crop_params = `./template_match_multiscale.py --template #{ARGV[0]}.jpg --image #{ARGV[0]}.tif`.chomp
+
+ # apply the crop
+ `convert #{ARGV[0]}.tif -crop #{crop_params} +repage #{ARGV[0]}_cropped.tif`
+
+ # calculate dimensions
+ lowres_dims = Dimensions.dimensions("#{ARGV[0]}.jpg")
+ highres_dims = Dimensions.dimensions("#{ARGV[0]}_cropped.tif")
+
+ # calculate scaling
+ x_scale = highres_dims[0].to_f / lowres_dims[0].to_f
+ y_scale = highres_dims[1].to_f / lowres_dims[1].to_f
+
+ # calculate scaled dimensions
+ cropped_width = stereo_metadata['width'] * x_scale
+ cropped_height = stereo_metadata['height'] * y_scale
+ x1 = stereo_metadata['x1'] * x_scale
+ x2 = stereo_metadata['x2'] * x_scale
+ y1 = stereo_metadata['y1'] * y_scale
+ y2 = stereo_metadata['y2'] * y_scale
+
+ # use the scaled dimensions to split the cropped original into the component images
+ $stderr.puts "Cropping image..."
+ `convert #{ARGV[0]}_cropped.tif -crop #{cropped_width}x#{cropped_height}+#{x1}+#{y1} +repage #{ARGV[0]}_0.png`
+
+ `convert #{ARGV[0]}_cropped.tif -crop #{cropped_width}x#{cropped_height}+#{x2}+#{y2} +repage #{ARGV[0]}_1.png`
+else
+ puts image_captures.to_json
+ abort("No matching captures for #{digital_id}")
+end
diff --git a/run-sterogranimator-hi-res.sh b/run-sterogranimator-hi-res.sh
new file mode 100755
index 0000000..9e18d71
--- /dev/null
+++ b/run-sterogranimator-hi-res.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+bundle exec ./nypl_recrop.rb $1 && ./run-torchwarp.sh $1
diff --git a/template_match_multiscale.py b/template_match_multiscale.py
new file mode 100755
index 0000000..4c104af
--- /dev/null
+++ b/template_match_multiscale.py
@@ -0,0 +1,108 @@
+#!/usr/bin/env python
+# USAGE
+# python template_match_multiscale.py --template template.png --image image.tif
+# Adapted from: http://www.pyimagesearch.com/2015/01/26/multi-scale-template-matching-using-python-opencv/
+
+# import the necessary packages
+import numpy as np
+import argparse
+import glob
+import cv2
+
+def resize(image, width = None, height = None, inter = cv2.INTER_AREA):
+ # initialize the dimensions of the image to be resized and
+ # grab the image size
+ dim = None
+ (h, w) = image.shape[:2]
+
+ # if both the width and height are None, then return the
+ # original image
+ if width is None and height is None:
+ return image
+
+ # check to see if the width is None
+ if width is None:
+ # calculate the ratio of the height and construct the
+ # dimensions
+ r = height / float(h)
+ dim = (int(w * r), height)
+
+ # otherwise, the height is None
+ else:
+ # calculate the ratio of the width and construct the
+ # dimensions
+ r = width / float(w)
+ dim = (width, int(h * r))
+
+ # resize the image
+ resized = cv2.resize(image, dim, interpolation = inter)
+
+ # return the resized image
+ return resized
+
+# construct the argument parser and parse the arguments
+ap = argparse.ArgumentParser()
+ap.add_argument("-t", "--template", required=True, help="Path to template image")
+ap.add_argument("-i", "--image", required=True,
+ help="Path to image where template will be matched")
+ap.add_argument("-v", "--visualize",
+ help="Flag indicating whether or not to visualize each iteration")
+args = vars(ap.parse_args())
+
+# load the template image, convert it to grayscale, and detect edges
+template = cv2.imread(args["template"])
+template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
+template = cv2.Canny(template, 50, 200)
+(tH, tW) = template.shape[:2]
+cv2.imshow("Template", template)
+
+imagePath = args["image"]
+# load the image, convert it to grayscale, and initialize the
+# bookkeeping variable to keep track of the matched region
+gray = cv2.imread(imagePath, 0)
+found = None
+
+# loop over the scales of the image
+for scale in np.linspace(0.2, 1.0, 20)[::-1]:
+ # resize the image according to the scale, and keep track
+ # of the ratio of the resizing
+ resized = resize(gray, width = int(gray.shape[1] * scale))
+ r = gray.shape[1] / float(resized.shape[1])
+
+ # if the resized image is smaller than the template, then break
+ # from the loop
+ if resized.shape[0] < tH or resized.shape[1] < tW:
+ break
+
+ # detect edges in the resized, grayscale image and apply template
+ # matching to find the template in the image
+ edged = cv2.Canny(resized, 50, 200)
+ result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
+ (_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)
+
+ # check to see if the iteration should be visualized
+ if args.get("visualize", False):
+ # draw a bounding box around the detected region
+ clone = np.dstack([edged, edged, edged])
+ cv2.rectangle(clone, (maxLoc[0], maxLoc[1]),
+ (maxLoc[0] + tW, maxLoc[1] + tH), (0, 0, 255), 2)
+ cv2.imshow("Visualize", clone)
+ cv2.waitKey(0)
+
+	# if we have found a new maximum correlation value, then update
+	# the bookkeeping variable
+ if found is None or maxVal > found[0]:
+ found = (maxVal, maxLoc, r)
+
+# unpack the bookkeeping variable and compute the (x, y) coordinates
+# of the bounding box based on the resized ratio
+(_, maxLoc, r) = found
+(startX, startY) = (int(maxLoc[0] * r), int(maxLoc[1] * r))
+(endX, endY) = (int((maxLoc[0] + tW) * r), int((maxLoc[1] + tH) * r))
+
+print "%dx%d+%d+%d" % ((endX - startX), (endY - startY), startX, startY)
+
+# draw a bounding box around the detected result and display the image
+# cv2.rectangle(gray, (startX, startY), (endX, endY), (0, 0, 255), 2)
+# cv2.imshow("Image", gray)
+# cv2.waitKey(0)