More on WARP Data (Part Two)

Yesterday, I wrote about “Densities and Visibility” and “Hidden Landscapes” with regard to the data generated by the Western Argolid Regional Project. Today, I am going to write up four more aspects of our virtual study season in what should be the final installment of WARP-related writing this summer. (You can read the rest here, here, here, and here.)

Next week, I return to Cyprus (at least in my writing and reading), but for now, WARP is the place.

Here are the final four observations on the WARP data crunching season.  

3. Land Use. As part of our standard descriptions of each survey unit, we recorded a good bit of land use data. This includes things like dominant vegetation, evidence for recent plowing, and the presence of features such as terrace walls that indicate material investment in the landscape. We initially chose to record this kind of information to provide insights into artifact recovery rates, but we soon discovered that this data also provided a high-resolution perspective on contemporary (and recent) land use in the Inachos valley.

For example, it became clear from our data that the three main commercial crops in the Inachos valley are olives, stone fruit (primarily apricots), and citrus. While olives are more or less ubiquitous throughout the survey area, citrus tends to appear only in units under 150 masl in elevation. Apricots appear in units under 200 masl, leaving units of 250 masl and higher in elevation to olives. This likely has to do with the susceptibility of these crops to frost damage. The presence of windmill-like air circulators in the citrus fields to the northwest of Argos confirms that this territory receives some manageable frost during the winter months. More durable apricots, a major export crop in our survey area, can endure occasional frosts, but are less rugged than the ubiquitous Greek olive tree.

Evidence for plowing tends to be most common in fields under 200 masl which have, for our survey area, moderate slopes of less than 13 degrees. Fields with compacted soils that show some evidence for plowing in the recent past (which we call “plowed, compacted” soils) extend slightly higher in elevation (averaging 215 masl) and onto slightly greater slopes (averaging 14 degrees). Higher elevations and greater slopes than these tend not to see regular plowing and are characterized by unplowed fields, even when the erosion of soils on slopes exceeding 20 degrees creates loose soil conditions. It should come as little surprise that units with plowed and loose soils tend to have the highest artifact densities and the highest visibilities. It is worth noting, however, that units with plowed and compacted soils, which indicate recent, but not ongoing plowing, produced higher densities than predicted by visibility alone.
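One way to make "higher densities than predicted by visibility alone" concrete is to fit a simple line of density against visibility across all units and then average the leftover (residual) density for each soil class. The sketch below does only that. The tuple layout, the class labels, and the choice of a straight-line fit are assumptions for illustration, not a description of how the WARP numbers were actually produced.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_residual_by_class(units):
    """units: list of (soil_class, visibility, density) tuples (hypothetical layout).

    Returns the average of (observed density - density predicted from
    visibility) for each soil class. A positive value means the class
    produced more artifacts than visibility alone would suggest.
    """
    slope, intercept = fit_line([u[1] for u in units], [u[2] for u in units])
    sums = {}
    for soil_class, visibility, density in units:
        total, count = sums.get(soil_class, (0.0, 0))
        residual = density - (slope * visibility + intercept)
        sums[soil_class] = (total + residual, count + 1)
    return {cls: total / count for cls, (total, count) in sums.items()}


# Made-up units, purely to show the shape of the input.
example_units = [
    ("plowed, compacted", 50, 80),
    ("unplowed", 50, 20),
    ("unplowed", 0, 0),
    ("unplowed", 100, 100),
]
residuals = mean_residual_by_class(example_units)
```

A real analysis would probably want to account for unit area and for the skewed distribution of densities, which a plain linear fit ignores.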

A significant network of terrace walls manages the sloping walls of the Inachos river valley, and we recorded over 850 units with terrace walls. This is over 10% of all survey units. The average elevation of a terraced field is 247 masl, with no terraces appearing below 108 masl and the highest over 500 masl. The average slope of a terraced field was 26 degrees. Predictably, most of the higher terraced fields (over 245 masl) were not plowed and these tended to have steeper slopes. There were, however, a number of recently or currently plowed terraced fields at lower elevations and slightly lower slopes, suggesting that access rather than elevation and slope alone determined whether terraced fields saw plowing. In fact, these recently or currently plowed terraced fields tended to produce much higher artifact densities than visibility alone would predict, whereas unplowed terraced fields tended to perform closer to what one would expect based on their visibility. This almost certainly reflects the high artifact densities from fields surrounding the ancient acropolis of Orneai, which is also in the immediate vicinity of the village of Lyrkeia and accessible via a network of paved and field roads.

4. Describing Chronological Landscapes. Over the last week or so, the project directors have been thinking about how best to describe the distribution of material from various periods across the entire landscape. This is distinct from how we interpret or understand the historical significance of particular patterns in the landscape. Instead, the idea (to my mind) is to describe the distribution of material in a consistent way across the entire survey area that allows for at least basic comparisons.

On the most basic level we can compare the character of assemblages by number of artifacts alone, but this speaks very little to the distribution of artifacts across our survey area. Thus combining the number of artifacts with the area of the units in which they appear helps to give some sense of distribution. David Pettegrew, in his recent (unpublished) analysis of the distribution of EKAS data, used nearest neighbor analysis (based, I believe, on the centroid of units) to determine whether the pattern produced by artifacts from various periods is clustered or dispersed. The vagaries of artifact recovery patterns could, I imagine, be managed by comparison with the overall pattern of the survey, which would allow us to say whether the overall distribution of artifacts from a particular period is more or less dispersed than the overall distribution of all artifacts from the survey (imagining that the latter reflects recovery rates).
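To make the idea of nearest neighbor analysis a bit more tangible, here is a minimal sketch of the standard Clark-Evans ratio computed on unit centroids, along with the comparison against the all-artifact pattern suggested above. It is not a reconstruction of Pettegrew's analysis. The function names, the use of plain (x, y) centroids in a projected coordinate system, and the lack of any edge correction are all assumptions.

```python
import math


def nearest_neighbor_ratio(points, area):
    """Clark-Evans ratio R for a list of (x, y) centroids.

    R is the observed mean nearest-neighbor distance divided by the mean
    distance expected for a random pattern of the same density.
    R < 1 suggests clustering; R > 1 suggests dispersion.
    No edge correction is applied in this sketch.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        total += min(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected


def relative_dispersion(period_points, all_points, area):
    """Ratio of a period's R to the R for all artifact-bearing units.

    Values above 1 mean the period is more dispersed than the survey-wide
    pattern; values below 1 mean it is more clustered.
    """
    return nearest_neighbor_ratio(period_points, area) / nearest_neighbor_ratio(
        all_points, area
    )
```

Dividing by the survey-wide ratio is only one way to use the overall pattern as a baseline for recovery rates. Irregular unit shapes and gaps in coverage would complicate any real application.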

Obviously one challenge here is the differential visibility or diagnosticity of particular periods on the surface. Certain periods – such as Pettegrew famously argued for the Late Roman period in Greece – are more visible than others, complicating a simple reading of distributional analysis as a measure for (say) the character of settlement in the survey area. The other challenge, of course, is the different date ranges for various periods. The Late Roman period (which we date to AD300-AD700), for example, is a good bit longer than the Classical period (450BC-300BC), which means that the Late Roman assemblage has had more than twice as long to develop in the landscape.
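The arithmetic behind the period-length problem is simple enough to sketch. Using the two date ranges given above, the snippet below computes each period's duration and converts a raw artifact count into a count per century. The artifact counts are invented for illustration. Dividing by duration is only one crude way to adjust for period length, and it does nothing about differential diagnosticity.

```python
# Date ranges from the text; BC years are written as negative numbers.
PERIODS = {
    "Classical": (-450, -300),
    "Late Roman": (300, 700),
}


def duration(period):
    """Length of a period in years."""
    start, end = PERIODS[period]
    return end - start


def per_century(count, period):
    """Artifact count scaled to a 100-year span of the period."""
    return count / duration(period) * 100


# Hypothetical counts, not WARP figures.
classical_rate = per_century(300, "Classical")    # 150-year period
late_roman_rate = per_century(400, "Late Roman")  # 400-year period
```

With these made-up numbers the Late Roman assemblage is larger in absolute terms but thinner per century, which is the kind of reversal that makes raw counts risky to compare across periods.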

There are various ways to manage for the differential diagnosticity and the different length of various periods to make these assemblages comparable. I tend to be fairly pessimistic about the potential of comparing assemblages from different periods. In other words, I think it is pretty hard to make arguments for the expansion or contraction of settlement by comparing assemblages from two different periods unless one establishes that the material signature of the two periods is fundamentally comparable.

That said, I suspect that the distribution patterns of material from various periods between different survey projects are likely to be more comparable than those between periods in the same survey project. For example, issues of differential visibility or diagnosticity on the surface tend to be common to most survey projects in a region and in most cases periodization schemes are, if not absolutely the same, at least broadly consistent at a regional level. In other words, being able to describe the various period landscapes across the survey area serves as the basis for later analysis of the periods in question rather than the analysis, necessarily, of the survey area across time (although it should also inform how we understand the survey area diachronically).

5. Chasing the Data. One thing that crunching data does reveal is the strengths and weaknesses of any dataset. Our dataset is quite a way from what I would consider big data and as a consequence little problems with our data can create big issues during analysis. (And here I’m assuming that the strength of big data schemes is that small imperfections or outliers in the data set tend to be washed out by the scale of the data more generally, for better or worse.) As I ran queries and did analyses and produced new datasets on the basis of data that we collected in the field, I discovered little problems. For example, the aoristic analysis that I posted last week was based on a chronology table that had the Archaic period dating from 750BC-AD450 rather than 750BC-450BC. This meant that pottery dated to the Archaic period was rather significantly underrepresented in the aoristic analysis that I conducted. It is an easy enough fix, fortunately, and one that probably would have become clear at some point in the publication process.
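A small sketch shows why the wrong end date matters so much. In a basic aoristic analysis, each artifact's weight of 1 is spread evenly across the years of its date range and then summed into time bins. The function below is a generic version of that idea, not the code behind last week's post. The century-wide bins, signed years for BC dates, and ignoring the missing year zero are all simplifying assumptions.

```python
def aoristic_weights(start, end, bin_width=100):
    """Spread one artifact's unit weight evenly over the range [start, end).

    Years are signed integers (BC negative). Returns a dict mapping each
    bin's starting year to the share of the artifact's weight in that bin.
    """
    span = end - start
    weights = {}
    bin_start = (start // bin_width) * bin_width
    while bin_start < end:
        # Years of overlap between the artifact's range and this bin.
        overlap = min(end, bin_start + bin_width) - max(start, bin_start)
        weights[bin_start] = overlap / span
        bin_start += bin_width
    return weights


correct = aoristic_weights(-750, -450)   # Archaic as 750BC-450BC
mistaken = aoristic_weights(-750, 450)   # Archaic as 750BC-AD450

# Share of a single Archaic sherd assigned to the century 700BC-600BC.
share_correct = correct[-700]    # 100 / 300
share_mistaken = mistaken[-700]  # 100 / 1200
```

Under these assumptions the mistaken range stretches each sherd over 1,200 years instead of 300, so every Archaic century receives about a quarter of the weight it should, with the remainder smeared across later centuries.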

At the same time, doing the work of analyzing our material is part of what brings various limitations to our data to the fore. For example, we didn’t ask our field teams to record the presence of terrace walls. So I had to excavate this data from a more general comment field. This was easy enough to do, of course, but I suspect that the dataset is a bit fuzzier around the edges than one generated by a simple check box.
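Pulling a yes/no value out of free-text comments usually comes down to a keyword search, and a short sketch makes the fuzziness easy to see. The pattern and function name below are hypothetical and are not the query actually run against the WARP database.

```python
import re

# Matches "terrace", "terraces", "terraced", "terracing" in any letter case.
TERRACE_PATTERN = re.compile(r"\bterrac(?:e|es|ed|ing)\b", re.IGNORECASE)


def mentions_terrace(comment):
    """Return True if a free-text comment mentions terracing at all.

    Empty or missing comments count as False. The check cannot tell
    "terraced field" from "no terraces here", which is one source of
    the fuzziness that a check box would avoid.
    """
    return bool(comment) and TERRACE_PATTERN.search(comment) is not None


flagged = [
    text
    for text in ["Terraced olive grove", "Flat plowed field", "No terraces here", None]
    if mentions_terrace(text)
]
```

Note that "No terraces here" is flagged along with the genuinely terraced unit, and a misspelling or an abbreviation in the comment field would be missed entirely.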

In the end, querying the data will both reveal its analytical limits and make it a stronger dataset. This kind of “slow data” work is both humbling, in that it reveals the limits of data collection processes in the field, and energizing, in that only through analysis do we recognize the potential of our data to reveal more about the landscape than we had intended.

6. Solitary Data Crunching. Finally, crunching data by myself has been pretty boring. One of the great things about study seasons is not so much the work of study, but the time to reflect, ask questions, make false starts, share processes, and think out loud (although my colleagues might not entirely agree about that last one!).

Crunching data alone in my home office feels so disconnected from the work of the survey. Left to my own devices and my own questions, I often end up spinning my wheels or working my way into a dead end of data which neither speaks to whatever hypothesis I have imagined nor leads me to new questions.

When doing data crunching next to my (often much smarter) colleagues, however, I constantly encounter new ways of seeing the data and imagining how it speaks to the archaeological landscapes that we explored together. In that context, data-oriented study seasons often led to trips through the survey area (and surrounding regions), shared memories and reflections on units and field practices, and deeper engagements with both the landscape and our data.

Data-ing alone, on the other hand, has made me feel not only a bit detached from the survey universe, but also mildly confounded by our data. Hopefully before we get to the publication stage, we’ll have time to revisit our data together in a more collaborative and conversational way, but for now, this is what we have and, despite it being a bit uncomfortable, I think I’ve made a bit of progress.
