A few things to get you started:
1) EUI, kWh/sf, W/sf, therms/sf, $/sf: benchmark all of these as if the building were currently in operation. Do the values make sense?
--- Consumption values may also change seasonally. Do the patterns make sense?
--- Does cfm/ft2 match expected values for auto-sized systems?
2) Does major HVAC equipment operate throughout its capacity range? (i.e., reconcile design capacities with modeled loads)
--- If you are not using design capacities, check auto-sized capacities against ASHRAE rules of thumb.
--- If you have design or actual values, verify the modeled capacities of major HVAC equipment against them.
3) If the modeling package includes a rendering feature, does the rendering match the architect's drawings (for new buildings) or photos of the actual site (for existing buildings)?
4) See the previous thread on this site about calibration methods for existing buildings. This is an area with a wealth of resources.
5) For new buildings, do the results track with the project's goals? If not, that will need to be resolved either by changing the model or by revising the goals.
6) Compare the energy end-use breakdown to other projects and explain any variances.
7) Calculate average annual efficiencies for primary equipment. Are they reasonable?
8) Use unmet hours and/or zone temperature reports to show that conditions are met appropriately. Verify that air temperatures and water temperatures are reasonable.
9) Verify that the correct weather file was used and that any needed elevation adjustments were made.
10) Verify modeled building area against the design or actual area. Don't count the plenums or dummy zones.
11) Assess any key assumptions that were made. Perform a sensitivity analysis to determine the effect on results if an assumption was too high, too low, etc.
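For item 11, a one-at-a-time sensitivity check can be sketched roughly like this. It is only an illustration: `toy_lighting_kwh` is a made-up stand-in for a real simulation run, and the input names, values, and the +/-20% perturbation are all assumptions.

```python
def sensitivity(model, baseline_inputs, name, deltas=(-0.2, 0.2)):
    """Perturb one input by each fractional delta; return % change in the output."""
    base = model(**baseline_inputs)
    results = {}
    for d in deltas:
        perturbed = dict(baseline_inputs)  # copy so the baseline is untouched
        perturbed[name] = baseline_inputs[name] * (1 + d)
        results[d] = 100.0 * (model(**perturbed) - base) / base
    return results


# Hypothetical stand-in for a full simulation: annual lighting energy in kWh.
def toy_lighting_kwh(lpd_w_per_sf, area_sf, hours_per_year):
    return lpd_w_per_sf * area_sf * hours_per_year / 1000.0


# Illustrative inputs only, not taken from any real project.
inputs = {"lpd_w_per_sf": 1.0, "area_sf": 10_000, "hours_per_year": 3_000}
changes = sensitivity(toy_lighting_kwh, inputs, "lpd_w_per_sf")

for delta, pct in changes.items():
    print(f"{delta:+.0%} on lpd_w_per_sf -> {pct:+.1f}% on annual kWh")
```

If a large swing in an input barely moves the result, that is worth investigating in its own right: either the input really doesn't matter, or the model isn't using it the way you think.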
Lots of other factors that might be specific to your model. Too many to list!
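As one example, the intensity metrics from item 1 are quick to compute from annual totals. This is a minimal sketch: the function name and all the input numbers are made up for illustration, and I'm assuming W/sf here means peak electric demand per unit area. The conversions are the standard site-energy ones (1 kWh = 3.412 kBtu, 1 therm = 100 kBtu).

```python
KBTU_PER_KWH = 3.412    # site energy conversion
KBTU_PER_THERM = 100.0


def intensity_metrics(annual_kwh, annual_therms, annual_cost, peak_kw, area_sf):
    """Normalize annual totals by conditioned floor area (exclude plenums/dummy zones)."""
    if area_sf <= 0:
        raise ValueError("area_sf must be positive")
    site_kbtu = annual_kwh * KBTU_PER_KWH + annual_therms * KBTU_PER_THERM
    return {
        "eui_kbtu_per_sf": site_kbtu / area_sf,
        "kwh_per_sf": annual_kwh / area_sf,
        "therms_per_sf": annual_therms / area_sf,
        "peak_w_per_sf": peak_kw * 1000.0 / area_sf,  # assumption: W/sf = peak demand
        "cost_per_sf": annual_cost / area_sf,
    }


# Illustrative numbers only.
metrics = intensity_metrics(
    annual_kwh=600_000, annual_therms=15_000, annual_cost=90_000,
    peak_kw=200, area_sf=50_000,
)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Compare the output against benchmark data for the same building type and climate before trusting anything else in the model.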
As for someone else's model, I'd ask them some of these same questions.
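If the model (yours or theirs) can export hourly zone temperatures, the unmet-hours check from item 8 can also be reproduced by hand. A rough sketch, assuming hourly lists of zone temperature, heating and cooling setpoints, and an occupied flag; the 1-degree tolerance and the sample data are assumptions, and your simulation tool's own unmet-hours report may use different rules.

```python
def count_unmet_hours(zone_temps, heat_setpoints, cool_setpoints, occupied, tolerance=1.0):
    """Count occupied hours where the zone is outside its setpoints by more than tolerance."""
    unmet_heating = 0
    unmet_cooling = 0
    for temp, heat_sp, cool_sp, occ in zip(zone_temps, heat_setpoints, cool_setpoints, occupied):
        if not occ:
            continue  # only occupied hours count in this sketch
        if temp < heat_sp - tolerance:
            unmet_heating += 1
        elif temp > cool_sp + tolerance:
            unmet_cooling += 1
    return unmet_heating, unmet_cooling


# Made-up five-hour sample in deg F.
temps = [68.0, 71.0, 77.5, 80.0, 65.0]
heat = [70.0] * 5
cool = [75.0] * 5
occ = [True, True, True, True, False]

unmet = count_unmet_hours(temps, heat, cool, occ)
print(f"unmet heating hours: {unmet[0]}, unmet cooling hours: {unmet[1]}")
```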
Thanks @rdzeldenrust for asking this question.
I am facing the same dilemma with my model in OpenStudio. I also ran a PAT (Parametric Analysis Tool) analysis trying different building orientations, but the results for each alternative differed only negligibly. There is definitely something wrong with my baseline model. What do you guys think? What should I go back and check (in OpenStudio specifically)?