Mapping the City Via Moving Images: A Brief Review and Discussions about the ‘Documentary’ Approach

Mapping is a powerful and productive instrument for construing and comprehending the world. Conventional approaches to mapping project data from a top-down view to unfold the city’s structure or spatial distribution. Abstract vistas from above help professionals handle the physical realm conveniently. However, they are less suited to explaining the richness and complexity of the city as it corresponds to human experience.

Mapping methods that extract information from perspective-view images are complementary to conventional approaches. Since the paradigmatic shift in urban design promoted by Kevin Lynch, Gordon Cullen and other scholars in the 1960s (Oliveira, 2016, p. 92), visual form and daily experience in urban space have become concerns of researchers and designers. From this perspective, images are an ideal medium for recording urban space and for mapping the diverse formal and sociocultural aspects of the city that are revealed by visual perception. In particular, mapping the city through moving images emphasizes the dynamic experience of the observer and reveals the relationship between successive scenes in the urban context.

Prototypes of using moving images to describe and analyze the form of the city can be found in the innovative experiments conducted by Venturi, Scott Brown, and their students in the Learning from Las Vegas Research Studio in 1968. In order to show the dominant features of the city’s landscape from different perspectives, the team shot four representative films titled Las Vegas Deadpan, Las Vegas Strip LfLV Studio (Day: Night), Las Vegas Electric, and Las Vegas Helicopter Ride (Stierli, 2013, pp. 156-160). Las Vegas Helicopter Ride adopts a bird’s-eye view to show the dominant features of urban space. Las Vegas Deadpan and Las Vegas Strip LfLV Studio adopt the perspective of a viewer in a car to represent the car-oriented urban form visually. The most exceptional one is Las Vegas Electric, which renders glittering nightscapes while shifting from the perspective of a spectator in an automobile to that of a pedestrian. Besides these four films, Venturi and Scott Brown reproduced the ‘Edward Ruscha’ elevation[1] in Las Vegas using “continuous motorized photos” (Venturi, Scott Brown, & Izenour, 2000, p. 32). Both Las Vegas Deadpan and the sequential photographs of the ‘Edward Ruscha’ elevation are documentary ways of presenting the Las Vegas Strip. The camera was mounted on the hood of the car for the former and parallel to the side windows for the latter.

These experiments suggest that moving images have great potential as a research tool for visualizing the characteristics of the physical environment. Standing in for the human eye, the camera’s eye records the city from the perspective of a spectator. In particular, there are two principal ways of manipulating the camera to create moving images for different objectives: documentary and simulatory[2] (Stierli, 2013, p. 42). The ‘simulatory’ approach simulates the moving experience of a viewer and responds to the dynamic changes of the city’s scenery. In simulatory filming, the focal length and the angle of the camera lens vary in order to imitate the wandering of the human eye, and subjective mapping methods like montage are often used. By contrast, the ‘documentary’ approach adopts an objective view: the camera is fixed on a moving device and keeps filming without zooming or panning. The resulting moving images are analogous to the sequential scenes seen by an ideal gazer. Some researchers have used ‘documentary’ moving images to analyze the urban form perceived by a viewer in motion (Appleyard, Lynch, & Myer, 1964; Tucker et al., 2005; Ding, 2011). Their studies indicate that the ‘documentary’ approach is an objective method and that it allows quantitative mapping and analysis of urban form. Therefore, this essay focuses only on the ‘documentary’ approach and elucidates the critical points of this mapping method.

Setting the viewer

Moving images bear traces of the presence and direct experience of the viewer. However, to use this technique to study the built environment quantitatively, it is necessary to ensure an adequate degree of objectivity. While the ‘documentary’ approach seems to satisfy this requirement, it nonetheless remains just a “rhetoric of objectivity” determined by the settings of the camera eye (Stierli, 2013, p. 138). Hence, the first critical issue to discuss is the state of the viewer and its influence on moving images. For each picture, the location of the viewer, the height of the viewpoint and the focal length of the lens decide which portions of the scene are framed in the camera’s field of vision. An experiment in which both the location and the height of the viewer were changed shows that the streetscape seen on foot concentrates mostly on the ground floor of buildings, while the streetscape taken from a car contains four or five floors of the buildings along the street. Moreover, according to the relationship of view size to lens focal length [3], the field of view taken by a 35 mm camera with a 50 mm lens looks ‘natural’ to the observer under normal viewing conditions.

Streetscape as viewed by pedestrians and cars — (Left) Streetscape from pedestrian’s view on the sidewalk; (Right) Streetscape from a Google Street View Car. Author: Zhouyan Wu.

The relationship between the field of view and camera settings — The relationship between the field of view taken by a 35mm camera with a 50mm lens and a 27mm lens at the same height. Author: Zhouyan Wu.
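The dependence of the field of view on focal length can be illustrated with a short calculation. The sketch below is not taken from the sources cited here; it assumes an ideal rectilinear lens focused at infinity and the 36 mm frame width of 35 mm film, and the function names are hypothetical.

```python
import math


def angle_of_view(focal_length_mm, frame_dim_mm=36.0):
    """Angle of view in degrees across one frame dimension.

    Assumes an ideal rectilinear lens focused at infinity.
    36 mm is the width of a 35 mm film frame (36 x 24 mm).
    """
    return math.degrees(2 * math.atan(frame_dim_mm / (2 * focal_length_mm)))


def framed_width_m(distance_m, focal_length_mm, frame_dim_mm=36.0):
    """Width of the scene framed on a plane at the given distance."""
    return distance_m * frame_dim_mm / focal_length_mm


# Compare the two lenses mentioned in the figure, at an assumed 10 m distance
for focal_length in (50, 27):
    print(focal_length, "mm lens:",
          round(angle_of_view(focal_length), 1), "degrees,",
          round(framed_width_m(10, focal_length), 1), "m framed at 10 m")
```

Under these assumptions, the 50 mm lens covers roughly 40 degrees horizontally and the 27 mm lens roughly 67 degrees, so the shorter lens frames almost twice as much of a facade from the same position.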

Since the Learning from Las Vegas Research Studio, where a camera was mounted on a car to film urban space, technical apparatuses have evolved considerably. Devices such as dashcams, Street View cars, Trekker Street View backpacks and sports cameras can film moving images at different heights and speeds. More recently, Unmanned Aerial Vehicles (drones) have been used to collect massive amounts of data and to survey the urban environment from a higher viewpoint.

A student taking streetscape pictures with a Trekker Street View backpack in an alley in Tokyo. Author: Zhouyan Wu.

The author taking moving images of Ginza main street with a GoPro. Author: Zhouyan Wu.

The angle between the camera eye and the direction of movement also determines how moving images reproduce the urban space. Based on the earlier deadpanning [4] of Las Vegas, there are two fundamental ways to collect and organize ‘documentary’ moving images. One is to set a spectator looking towards the building-street interface, thus at a ninety-degree angle to the direction of movement, and merge serial photographs into a horizontal scroll of elevation views. In this case, moving images are taken from a third-person point of view. The other is to orient the viewer in the direction of travel and construct a sequence of perspectives with a first-person shot. As a result, this second approach can render more accurately the experience of a traveler walking or driving in the city.

Various mappings: (left) a horizontal scroll made by moving in elevation view; (right) a sequence made by moving in perspective view. Author: Zhouyan Wu.
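One simple way to merge side-facing frames into a horizontal scroll is to keep only a narrow vertical strip from the center of each frame and join the strips in order. The sketch below is an illustrative assumption, not the procedure used by Venturi and Scott Brown or by the author; frames are represented as plain lists of pixel rows, and `build_scroll` and `strip_width` are hypothetical names.

```python
def build_scroll(frames, strip_width=4):
    """Join the central vertical strip of each frame into one wide image.

    Each frame is a list of rows and each row is a list of pixel values.
    The frames are assumed to share the same size and to be in filming order.
    """
    height = len(frames[0])
    scroll = [[] for _ in range(height)]
    for frame in frames:
        # First column of the strip, centered on the middle of the frame
        start = len(frame[0]) // 2 - strip_width // 2
        for y in range(height):
            scroll[y].extend(frame[y][start:start + strip_width])
    return scroll


# Synthetic stand-in for footage: 10 frames of 120 rows by 160 columns,
# where every pixel of frame i has the value i
frames = [[[i] * 160 for _ in range(120)] for i in range(10)]
scroll = build_scroll(frames, strip_width=4)
print(len(scroll), "rows by", len(scroll[0]), "columns")
```

In practice the strip width would have to match the distance traveled between two frames, otherwise facades appear stretched or compressed along the scroll.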

Speed and resolution

The resolution of a mapping acquired through moving images from a perspective view relates to the speed of movement. In the book Learning from Las Vegas, Venturi, Scott Brown, and Izenour (2000, p. 11) illustrate the relationship between space, speed and the size of signboards. Their results show that only large objects are noticeable to a viewer moving in a car. To capture pedestrians’ point of view, by contrast, a mapping method with a higher resolution is essential.

Rapoport (1990) discusses changes in the perception of the built environment according to different speeds. When the speed increases, fewer details can be noticed even on the same street segment. At walking speed, people receive more high-resolution information from the environment than at driving speed. Moudon (1991) likewise emphasizes that the rate of information, in other words the number of perceptible differences per unit of time, is higher for pedestrians. Consequently, to map smaller physical elements, it is necessary to film moving images at a low speed.

Finally, the process of exporting moving images in a given software environment also determines the resolution of the map. Since moving images are sequential frames exported from a clip that has been recorded at a specified number of frames per second (fps), the rate used for extraction can differ from the original fps setting. In an ideal situation, when the extraction rate is decreased for a clip filmed at a low speed, the exported sequence of pictures is equivalent to one that is filmed at a high speed but exported at the original rate. This means that the resolution of the mapping remains adjustable even after the original video has been taken.
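The equivalence between extraction rate and speed of movement can be made concrete with a little arithmetic. The following sketch is only an illustration under stated assumptions: a constant speed of movement, a walking speed of 1.4 m/s and a source rate of 30 fps are all assumed values, and the function names are hypothetical.

```python
def frame_indices(total_frames, source_fps, export_fps):
    """Indices of the source frames kept when exporting at export_fps."""
    step = source_fps / export_fps        # source frames per exported frame
    count = int(total_frames / step)      # number of exported frames
    return [int(i * step) for i in range(count)]


def frame_spacing_m(speed_m_s, export_fps):
    """Distance traveled between two consecutive exported frames."""
    return speed_m_s / export_fps


def equivalent_speed_m_s(speed_m_s, source_fps, export_fps):
    """Speed that would give the same spacing at the original frame rate."""
    return speed_m_s * source_fps / export_fps


# Assumed values: a 10-second clip filmed on foot at 1.4 m/s and 30 fps
walking_speed, source_fps = 1.4, 30
for export_fps in (30, 3, 1):
    kept = frame_indices(300, source_fps, export_fps)
    print(export_fps, "fps:", len(kept), "frames,",
          round(frame_spacing_m(walking_speed, export_fps), 2), "m apart,",
          round(equivalent_speed_m_s(walking_speed, source_fps, export_fps), 1),
          "m/s equivalent")
```

Under these assumptions, keeping one frame in ten spaces the pictures as a camera moving at about 14 m/s (roughly 50 km/h) would if every frame were kept.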

Sequences exported at different fps from the same film taken while walking: (left) 1 fps = walking; (middle) 3 fps = cycling; (right) 12 fps = driving. Author: Zhouyan Wu.

Spatiotemporal sequence

How to organize the moving images is the final critical step of this mapping technique. One of the earliest attempts is The Concise Townscape by Gordon Cullen, in which he vividly presents the visual experience of walking through a group of buildings (1995). He argues that visual perception splits the scene into two parts: the existing view and the emerging view. At the same time, the temporal dimension of movement intensifies the impact of the perception. However, Cullen’s work stops at a qualitative description. For a quantitative analysis of the built environment, how can we place the sequence on a “flat and scaled” “surface”[5] that can be used to understand the position and the proportion of the various elements captured in the image?

To tackle this problem, in the book The View from the Road (1964), Appleyard, Lynch, and Myer invented a technique to arrange image sequences in vertical progression. In contrast with Cullen’s serial vision, here the temporal order runs from the lower edge of the page to its upper edge, at a uniform pace. To use the page creatively to display the experience of movement, an upward-pointing arrow is placed at the bottom to guide readers and to indicate forward motion towards the place ahead. Since the lower edge of the paper is closer to our body, and thus symbolizes our present location, the eyes move upwards, simulating movement in space and time. Following Bosselmann’s interpretation (1998), this technique is similar to the convention in western art of depicting unknown conditions, the future, hopes, and expectations in the upper part of a picture.
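The vertical ordering described above can be sketched in a few lines. This is only an illustration of the page layout, not a reconstruction of the notation used in The View from the Road; it assumes frames sampled at equal intervals along the route and represented as lists of pixel rows, and `vertical_progression` is a hypothetical name.

```python
def vertical_progression(frames):
    """Stack frames into one page with the first frame at the bottom.

    Each frame is a list of rows; row 0 of the result is the top of the page,
    so the frames are stacked in reverse temporal order.
    """
    page = []
    for frame in reversed(frames):
        page.extend(frame)
    return page


# Three tiny synthetic frames (2 rows by 3 columns); frame i has pixel value i
frames = [[[i] * 3 for _ in range(2)] for i in range(3)]
page = vertical_progression(frames)
print(page[0], "at the top,", page[-1], "at the bottom")
```

Reading the resulting page from bottom to top follows the route from the present location towards the place ahead.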

Another approach to mapping sequential pictures along one route in the city is offered by Ding (2011). Based on Minkowski’s coordinates, she constructs a method to present the spatial characteristics of urban form. In particular, Ding proposes a time-space system whose vertical coordinate is the time axis (t) and whose horizontal coordinate is the street width axis (x). With this method, it is possible to represent the relationship between velocity and the perceived street pattern.

Towards a new research tool

Based on the attempts briefly discussed in this article, moving images are a promising tool for mapping urban space from the viewpoint of a spectator traveling in the city. On the one hand, they can be used to measure and analyze urban form at a higher resolution. On the other hand, they render sequential perceptions of urban space. However, the potential of this technique to reveal patterns of human activities is still insufficiently investigated. This gap shows the urgency for a new quantitative tool that can integrate the analysis of urban form with the discourse about the public realm.

[1] Scott Brown was inspired by the pop artist Edward Ruscha and his work Every Building on the Sunset Strip. She used “Edward Ruscha elevation” in the caption of the photo collage of building elevations to refer to the specific representational method.

[2] Martino Stierli has defined the ‘double image’ of the city, a documentary image developed by single illustrations and photo sequences, and a simulatory image developed by photo collages. This essay borrows the dichotomy of documentary and simulatory.

[3] The website of Panasonic presents the relationship of focal length to the viewing angle. Source: Panasonic.

[4] Deadpanning means filming without changing the angle or the focal length of the camera.

[5] James Corner deconstructs mapping operations into “fields, extracts, and plottings”. The field is the surface to plot isolated information. See Corner, 1999, p. 229.

References

Appleyard, D., Lynch, K., & Myer, J. R. (1964). The view from the road. Cambridge, Mass.: M.I.T. Press.

Bosselmann, P. (1998). Representation of places: reality and realism in city design. Berkeley: University of California Press.

Corner, J. (1999). The Agency of Mapping: Speculation, Critique and Invention. In D. Cosgrove (Ed.), Mappings (pp. 213–252). London: Reaktion.

Ding, W. (2011). Mapping Urban Spaces: Moving Image as a Research Tool. In F. Penz & A. Lu (Eds.), Urban cinematics: understanding urban phenomena through the moving image (pp. 315–355). Bristol, UK; Chicago, USA: Intellect.

Moudon, A. V. (Ed.). (1991). Public streets for public use (Morningside ed). New York: Columbia University Press.

Oliveira, V. (2016). Urban morphology: an introduction to the study of physical form of cities. Switzerland: Springer.

Rapoport, A. (1990). History and precedent in environmental design. New York: Plenum Press.

Stierli, M. (2013). Las Vegas in the rearview mirror: the city in theory, photography, and film. Los Angeles, Calif: Getty Research Institute.

Tucker, C., Ostwald, M., Chalup, S., & Marshall, J. (2005). A method for the visual analysis of the streetscape.

Venturi, R., Scott Brown, D., & Izenour, S. (2000). Learning from Las Vegas: the forgotten symbolism of architectural form (17th print). Cambridge, Mass.: The MIT Press.
